PREFACE


Plus ça change, plus c’est la même chose


Modes come and go in mathematics as in most fields. During the half-century and 
more that I have worked in the vineyard I have heard many dire predictions for the 
fate of my ideas and interests. Abstraction has been in the saddle during most of 
the time and has ridden us mercilessly. In a modest way I have taken part in this 
development. I did not believe in abstraction per se; one should know what one is
trying to generalize and one should show that the generalization is significant. I 
have tried to keep at least one foot on the ground while craning my neck to look 
into Heaven. Was it Heaven? There are some doubts, and the more extravagant 
claims of the abstract mathematicians to be the sole dispensers of the true faith 
and the arbiters of values are received with a healthy scepticism. A recent letter to 
the editor of the Notices of the American Mathematical Society headed “Can 
Mathematics be Saved?,” understood to be from the modern mathematicians, is
more than a straw in the wind. “Applicable Analysis” is becoming sufficiently
popular to sport its own journal. 

This book may be regarded as part of the backlash. If the book has a thesis, it 
is that a functional analyst is an analyst, first and foremost, and not a degenerate 
species of a topologist. His problems come from analysis and his results should 
throw light on analysis. The book was originally planned as an elementary intro- 
duction to functional analysis, but in working out the guiding ideas I soon found 
that my program took in only part of what nowadays goes under the title of func- 
tional analysis. On the other hand, applications to analysis were stressed and un- 
varnished classical analysis was handed out in significant doses. My interests have 
always been toward the concrete sides of analysis, a tendency that has not become 
less pronounced with the years. 

It seemed to me that I could do some useful work in giving the student a histori- 
cal perspective and in showing how the multitude of abstract concepts have arisen 
and are present in nuce already in Euclidean spaces. This required a rather detailed 
discussion of the space $C^n$ and applying the general ideas to illustrative material
taken from the basic infinite-dimensional spaces such as continuous functions, 
functions of bounded variation, sequence, and Lebesgue spaces. The further 
development of the book reflects my own interest and research during the last 
twenty-five years. I stress complex analysis in Banach spaces and Banach algebras 
because it has meant so much to me and may benefit a budding analyst even in the 
seventies. Fixed point theorems and functional inequalities are more recent interests
and are eminently applicable. So are functional equations and mean values and, 
even if they come last, they are not least. A mean value may be an abstract notion, 
but it has applications to extremal problems, to set functions, to potential theory, 
and probably to more to come. Linear transformations and inner product spaces 
are more canonical items and need no justification. 

Practically the whole book has been lecture material given from Bombay to 
Kingston, R.I., under varying titles. The book is not polished; there are loose ends 
and many unsolved problems waiting for the craftsman. Such research openings 
occur chiefly in the last four chapters. There are over 850 exercises, a fair part of 
which are byproducts of the author’s own research. While problem solving should 
not become an obsession, it provides good training in research and stimulates an 
inquisitive mind to venture into new directions. 

The material in the book can serve several different purposes. Chapters 1 to 4 
and 7 to 11 can serve as text for an introductory course in functional analysis. In
the author’s judgment it offers a fairly easy introduction and supplement to the
more advanced books such as Dunford-Schwartz, Hille-Phillips, and Yosida.
Chapters 5, 6, and 12 through 15 stand closer to classical analysis than does the 
rest of the book and can serve as a text for a course in aspects of analysis. 

The book has been written at the University of New Mexico, where the Depart- 
ment of Mathematics and Statistics has given friendly help and support. Many 
knotty questions have been clarified in discussions with colleagues here. B. Epstein, 
R. Hersh and R. Metzler provided many illuminating comments. I am particularly 
indebted to Ih-Ching Hsu, who read the whole manuscript and has made elaborate
suggestions for improvements. J. B. Miller of Monash University, Victoria, has
also read the manuscript and given me the benefit of his advice and detailed com- 
ments. The last two chapters owe much to the criticism of J. Aczél and C. T. Ng of 
the University of Waterloo, Ontario. Comments have also been received from the 
referees. To all these helpful friends I express my sincere gratitude. 

I also wish to thank the staff of Addison-Wesley Publishing Company for con- 
sideration, kindness, and understanding during the writing phase and for a work 
well done in presenting the book to the public. 


Albuquerque, New Mexico E. H.
April, 1971


+ J. Donaldson and I.-C. Hsu have also helped with the proof-reading. 
1 COMPLEX EUCLIDEAN SPACES


Modern analysis, especially functional analysis, is concerned with analysis in 
abstract, usually linear, spaces. The elements of such a space are called “points”
or “vectors” in analogy with the usage in Euclidean spaces. The latter form the
prototype of all abstract spaces and the theory of Euclidean spaces is a necessary 
tool for the study of the generalizations. Hence we must start our discourse with a 
study of Euclidean spaces. Much of the theory is just linear algebra and thus 
supposedly familiar to the student, but our emphasis is directed by the needs for the 
generalizations. The reader will encounter a large number of concepts which will 
play a basic role in later chapters. They will largely be just labels at this early stage 
because we are not ready for formal definitions. The reader may be assured, how- 
ever, that there is a purpose behind the introduction of this multitude of concepts. 
They serve as prototypes for later generalizations, and one has to have a fairly 
good idea of the elementary models before fruitful extensions can be attempted. 

This chapter is divided into seven sections: Euclidean three space; The space
$C^n$; Linear transformations; Matrices; The resolvent; Invariantive and metric
properties; and Hermitian forms. 


1.1 EUCLIDEAN THREE SPACE 


The ordinary Euclidean three space will be denoted by $R^3$. To simplify the
exposition we introduce Cartesian coordinates in the familiar manner. We choose 
an origin O, a unit of length, and three perpendicular lines through O referred to as 
the axes. If P is any point in space other than O, the directed line segment from O to
P is called the vector $\overrightarrow{OP}$ or simply $\mathbf{x}$, which is used both to denote the vector and its
endpoint P. Thus we speak equivalently of the vector $\mathbf{x}$ and the point $\mathbf{x}$. With $\mathbf{x}$ is
associated an ordered triple $(x_1, x_2, x_3)$ of three real numbers, the coordinates of $\mathbf{x}$.
We use roman bold face type for vectors, italics for their coordinates. The
coordinates are obtained by the following construction. We take orthogonal
projections of $\overrightarrow{OP}$ on the three axes and obtain three vectors, $\overrightarrow{OX_1}$, $\overrightarrow{OX_2}$, $\overrightarrow{OX_3}$, as a
result. Each of the three axes is given an orientation so that for that axis one
direction from the origin is said to be positive, the opposite negative. In this way it
makes sense to speak of a positive vector $\overrightarrow{OX_j}$, $j = 1, 2, 3$. The orientation and
numbering of the axes is chosen in such a manner that three positive vectors $\overrightarrow{OX_1}$,
$\overrightarrow{OX_2}$, $\overrightarrow{OX_3}$ correspond to a right hand screw: if the axes are built into the screw
with the third axis being the axis of the screw, then a rotation from $\overrightarrow{OX_1}$ to $\overrightarrow{OX_2}$
carries the screw upwards in the direction of $\overrightarrow{OX_3}$.
We denote the Euclidean length of the vector by 


$$|\overrightarrow{OP}| \quad \text{or} \quad \|\mathbf{x}\|. \tag{1.1.1}$$

The symbol $\|\mathbf{x}\|$ is read “norm of $\mathbf{x}$” and will be used throughout this treatise as
notation for various generalizations of the elementary Euclidean length. We now
set

$$x_j = \pm\,|\overrightarrow{OX_j}|, \qquad j = 1, 2, 3. \tag{1.1.2}$$

Here the plus sign is chosen if $\overrightarrow{OX_j}$ has the positive direction, otherwise the minus
sign.
In this manner we obtain a one-to-one correspondence between the points P of 
the space and the ordered number triples 
$$P \leftrightarrow (x_1, x_2, x_3) = \mathbf{x}. \tag{1.1.3}$$

Here $x_1, x_2, x_3$ are known as the coordinates of $\mathbf{x}$ in the chosen coordinate system.
We now proceed to define vector addition and scalar multiplication. 


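Definition 1.1.1. If $\mathbf{x} = (x_1, x_2, x_3)$, $\mathbf{y} = (y_1, y_2, y_3)$ and if $\alpha$ is a real number,
then $\mathbf{x} + \mathbf{y}$ and $\alpha\mathbf{x}$ are the vectors

$$\begin{aligned}
\mathbf{x} + \mathbf{y} &= (x_1 + y_1,\ x_2 + y_2,\ x_3 + y_3), \\
\alpha\mathbf{x} &= (\alpha x_1,\ \alpha x_2,\ \alpha x_3).
\end{aligned} \tag{1.1.4}$$

In terms of the unit vectors
$$\mathbf{u}_1 = (1, 0, 0), \qquad \mathbf{u}_2 = (0, 1, 0), \qquad \mathbf{u}_3 = (0, 0, 1), \tag{1.1.5}$$
we have now the representation
$$\mathbf{x} = x_1\mathbf{u}_1 + x_2\mathbf{u}_2 + x_3\mathbf{u}_3 \tag{1.1.6}$$

which is unique since it implies and is implied by
$$\mathbf{x} = (x_1, x_2, x_3)$$
together with Definition 1.1.1.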
Formula (1.1.6) suggests the following question. Given three vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$
in $R^3$, when is it possible to represent an arbitrary vector $\mathbf{x}$ of $R^3$ as a linear
combination of the given vectors with constant coefficients, say

$$\mathbf{x} = s_1\mathbf{v}_1 + s_2\mathbf{v}_2 + s_3\mathbf{v}_3\,? \tag{1.1.7}$$
A necessary and sufficient condition for the existence of such a representation is
clearly that the unit vectors $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ can be so represented. Since all vectors are to
be representable, $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ must have this property, so the condition is necessary.
On the other hand, if

$$\mathbf{u}_j = \sum_{k=1}^{3} b_{jk}\mathbf{v}_k, \qquad j = 1, 2, 3, \tag{1.1.8}$$

then combining this with (1.1.6) leads to (1.1.7) with

$$s_j = \sum_{k=1}^{3} x_k b_{kj}. \tag{1.1.9}$$

Now we know that the v’s are linearly representable in terms of the u’s, say

$$\mathbf{v}_j = \sum_{k=1}^{3} a_{jk}\mathbf{u}_k, \qquad j = 1, 2, 3. \tag{1.1.10}$$

This is a system of three linear equations for the u’s, so our problem is solvable iff
(read “if and only if”) the system (1.1.10) is solvable for the u’s for any given set
of the v’s and this is the case iff the determinant of the system

$$\det(a_{jk}) \neq 0. \tag{1.1.11}$$

If this is the case, we can express the u’s linearly in terms of the v’s, for instance by
Cramer’s rule.
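As a numerical illustration of the criterion (1.1.11), the following Python sketch solves (1.1.7) for the coefficients $s_1, s_2, s_3$ by Cramer's rule. The function names `det3` and `coefficients` are ours and do not come from the text; exact rational arithmetic is assumed so that the test for a vanishing determinant is reliable.

```python
from fractions import Fraction


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def coefficients(v1, v2, v3, x):
    """Solve x = s1*v1 + s2*v2 + s3*v3 by Cramer's rule.

    Returns (s1, s2, s3), or None when the determinant vanishes,
    i.e. when v1, v2, v3 do not form a basis.
    """
    vs = (v1, v2, v3)
    # Row j of the system matrix holds the j-th coordinates of v1, v2, v3.
    matrix = [[Fraction(vs[k][j]) for k in range(3)] for j in range(3)]
    d = det3(matrix)
    if d == 0:
        return None
    solution = []
    for k in range(3):
        # Replace column k by the coordinates of x.
        replaced = [[Fraction(x[j]) if col == k else matrix[j][col]
                     for col in range(3)] for j in range(3)]
        solution.append(det3(replaced) / d)
    return tuple(solution)


print(coefficients((1, 1, 0), (0, 1, 1), (1, 0, 1), (2, 3, 5)))
print(coefficients((1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 3, 5)))
```

The first call yields the coefficients 0, 3, 2 (as exact fractions); the second returns `None` because the three given vectors lie in one plane, so the determinant is zero.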
Now if the determinant is not zero, then the v’s defined by (1.1.10) are linearly
independent, i.e. a relation
$$a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + a_3\mathbf{v}_3 = \mathbf{0} \tag{1.1.12}$$

with real numbers $a_1, a_2, a_3$ implies that all three numbers are zero. Conversely,
given any three linearly independent vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$, we can then find numbers
$a_{jk}$ such that (1.1.10) holds and the linear independence implies that $\det(a_{jk}) \neq 0$ so
the u’s can be expressed linearly in the v’s. We refer to such a set of three vectors
$\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ as a basis of $R^3$.

In geometrical language a relation (1.1.12) with coefficients not all zero expresses that the three vectors lie
in the same plane through the origin. They are then said to be coplanar. Two vectors 
are said to be collinear if one is a constant multiple of the other so that they lie on the 
same straight line through the origin. 

Suppose that we start with any two non-collinear vectors $\mathbf{v}_1$ and $\mathbf{v}_2$. They
determine uniquely a plane through the origin. Now take any third vector $\mathbf{v}_3$ not
coplanar with $\mathbf{v}_1$ and $\mathbf{v}_2$. Then the three vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ form a basis for $R^3$ since
they are linearly independent.

Moreover, given any basis $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ we can construct an orthogonal basis. We
may impose the additional restriction that the new vectors be of unit length, in
which case one speaks of an orthonormal basis. The set $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ introduced above
is an example of an orthonormal basis. The following construction is a simple
special case of what is known as the Gram-Schmidt orthogonalization process after
the Danish actuary Jørgen Pedersen Gram (1850-1916) and the German
mathematician Erhard Schmidt (1876-1959).

We start with $\mathbf{v}_1$, which may be assumed to be of length one; if not, divide by its
length. This gives a new vector of the same direction as $\mathbf{v}_1$ and of length one.
Consider now the plane $\Pi$ determined by $\mathbf{v}_1$ and $\mathbf{v}_2$. All vectors in this plane are
linear combinations of the form

$$\mathbf{v} = \alpha\mathbf{v}_1 + \beta\mathbf{v}_2.$$

Geometrical intuition asserts that among these vectors there are two and only two
vectors which are of length one and perpendicular to $\mathbf{v}_1$. Let one of them be
denoted by $\mathbf{u}_2$ and write $\mathbf{u}_1$ for $\mathbf{v}_1$. Then there are constants $\alpha_{21}$ and $\alpha_{22}$ such that

$$\mathbf{u}_2 = \alpha_{21}\mathbf{v}_1 + \alpha_{22}\mathbf{v}_2.$$

Here $\alpha_{22} \neq 0$ for $\mathbf{u}_2$ may conceivably be a multiple of $\mathbf{v}_2$ but it cannot be a
multiple of $\mathbf{u}_1$. It follows that $\mathbf{u}_1$ and $\mathbf{u}_2$ form a basis for the vectors in $\Pi$. Now
consider all vectors in $R^3$ which are not coplanar with $\mathbf{u}_1$ and $\mathbf{u}_2$. Again, geometrical
intuition asserts that there are two and only two such vectors which are of length
one and are perpendicular to $\mathbf{u}_1$ and $\mathbf{u}_2$. We simply erect the normal to $\Pi$ at the
origin and lay off the distance one to either side of the origin. One of these vectors
is taken as $\mathbf{u}_3$ and it is clear that there are three numbers $\alpha_{31}, \alpha_{32}, \alpha_{33}$ such that

$$\mathbf{u}_3 = \alpha_{31}\mathbf{v}_1 + \alpha_{32}\mathbf{v}_2 + \alpha_{33}\mathbf{v}_3,$$

where $\alpha_{33} \neq 0$ since the vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ are not coplanar. This set of vectors
$\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$ forms an orthonormal basis of $R^3$. In choosing $\mathbf{u}_2$ and $\mathbf{u}_3$ we may make
the choice in such a manner that the new orthonormal basis is similarly oriented
as the vectors of (1.1.5). If this is done, there exists a uniquely defined rotation
which keeps the origin fixed and carries the old orthonormal system into the new
one.

Let us get rid of the “geometric intuition” which is too feeble to support
generalizations. What we need is the notion of an inner product which in $R^3$
coincides with the dot product of classical vector analysis. Any pair of vectors $\mathbf{x}$ and
$\mathbf{y}$ in $R^3$, distinct or not, has an inner product which is a real number denoted by
$(\mathbf{x}, \mathbf{y})$. We shall specify the desired properties of $(\mathbf{x}, \mathbf{y})$ which will enable us to give
explicit formulas for this number. The following definition makes sense in any
inner-product space over the reals, whatever this may mean, and requires little change
in the complex case.


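Definition 1.1.2. For any pair of vectors $\mathbf{x}, \mathbf{y}$ of the space, distinct or not, there
is a uniquely defined number $(\mathbf{x}, \mathbf{y})$ called the inner product of $\mathbf{x}$ with $\mathbf{y}$ such that

1) $(\mathbf{x}, \mathbf{y})$ is real,

2) $(\mathbf{y}, \mathbf{x}) = (\mathbf{x}, \mathbf{y})$;

3) $(\alpha\mathbf{x}_1 + \beta\mathbf{x}_2, \mathbf{y}) = \alpha(\mathbf{x}_1, \mathbf{y}) + \beta(\mathbf{x}_2, \mathbf{y})$ (linearity);

4) $(\mathbf{x}, \mathbf{x}) \geq 0$ for all $\mathbf{x}$, $(\mathbf{x}, \mathbf{x}) = 0$ iff $\mathbf{x} = \mathbf{0}$;

5) $(\mathbf{x}, \mathbf{x}) = \|\mathbf{x}\|^2$.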
Here (3) is to hold for all real numbers $\alpha$, $\beta$ and all vectors $\mathbf{x}_1, \mathbf{x}_2, \mathbf{y}$. Let us
remark in passing that in the complex case, considered already in the next section,
the word “real” is to be replaced by “complex” and in (2) the right member is to be
replaced by its complex conjugate.

Combining (2) and (3) we see that

$$(\mathbf{x}, \alpha\mathbf{y}_1 + \beta\mathbf{y}_2) = \alpha(\mathbf{x}, \mathbf{y}_1) + \beta(\mathbf{x}, \mathbf{y}_2). \tag{1.1.13}$$

The two relations (3) and (1.1.13) express that the inner product is bilinear.


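Lemma 1.1.1. If $\mathbf{x}, \mathbf{y} \in R^3$ and make the angle $\theta$ with each other, then
$$(\mathbf{x}, \mathbf{y}) = \|\mathbf{x}\|\,\|\mathbf{y}\| \cos\theta. \tag{1.1.14}$$
Proof. We use the Law of Cosines on the triangle two sides of which are $\mathbf{x}$ and $\mathbf{y}$.
The third side, oriented from $\mathbf{x}$ to $\mathbf{y}$, is parallel to and has the same length and
orientation as the vector $\mathbf{y} - \mathbf{x}$. Its length is $\|\mathbf{y} - \mathbf{x}\|$ and the Law of Cosines gives

$$\|\mathbf{y} - \mathbf{x}\|^2 = \|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 - 2\,\|\mathbf{x}\|\,\|\mathbf{y}\| \cos\theta.$$
Here the left member equals
$$(\mathbf{y} - \mathbf{x}, \mathbf{y} - \mathbf{x}) = (\mathbf{y}, \mathbf{y}) - (\mathbf{y}, \mathbf{x}) - (\mathbf{x}, \mathbf{y}) + (\mathbf{x}, \mathbf{x})$$
by properties (2) and (3). Using (2) and (5) we see that this reduces to
$$\|\mathbf{x}\|^2 + \|\mathbf{y}\|^2 - 2(\mathbf{x}, \mathbf{y}).$$
Comparison of the two expressions for the length of the third side gives (1.1.14). ∎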


This lemma shows that the inner product (x, y) is unique and expressible in 
terms of intrinsic properties of the vectors x, y, namely their lengths and the angle 
between them. We also note 


Corollary 1. The vectors $\mathbf{x}$ and $\mathbf{y}$ are orthogonal iff

$$(\mathbf{x}, \mathbf{y}) = 0. \tag{1.1.15}$$
Corollary 2. We have
$$|(\mathbf{x}, \mathbf{y})| \leq \|\mathbf{x}\|\,\|\mathbf{y}\| \tag{1.1.16}$$

with equality iff $\mathbf{y}$ is a constant multiple of $\mathbf{x}$.
This is the first instance of an inequality listed in this treatise together with a
statement of when the inequality becomes an equality. Such considerations will
occur again and again in the following.
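The relation (1.1.14) and the inequality (1.1.16) may be checked numerically. The Python sketch below assumes the coordinate formula (1.1.23) for the inner product, given later in this section; the helper names `inner`, `norm` and `angle` are ours. Since the angle is computed from (1.1.14) itself, the comparison with the Law of Cosines is a consistency check only, not an independent proof.

```python
import math


def inner(x, y):
    """Dot product x1*y1 + x2*y2 + x3*y3 in an orthonormal basis."""
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Euclidean length, the square root of (x, x)."""
    return math.sqrt(inner(x, x))


def angle(x, y):
    """Angle between two nonzero vectors, obtained from (1.1.14)."""
    c = inner(x, y) / (norm(x) * norm(y))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding


x = (1.0, 0.0, 0.0)
y = (1.0, 1.0, 0.0)
theta = angle(x, y)

# Law of Cosines for the triangle with sides x, y and y - x.
third_side = tuple(b - a for a, b in zip(x, y))
lhs = norm(third_side) ** 2
rhs = norm(x) ** 2 + norm(y) ** 2 - 2 * norm(x) * norm(y) * math.cos(theta)

print(math.degrees(theta))                    # close to 45 degrees
print(abs(lhs - rhs) < 1e-12)                 # both sides agree
print(abs(inner(x, y)) <= norm(x) * norm(y))  # inequality (1.1.16)
```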


Lemma 1.1.2. Three vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ are linearly independent iff the determinant
of their inner products is different from zero, i.e.
$$G = \det\big((\mathbf{v}_j, \mathbf{v}_k)\big) \neq 0. \tag{1.1.17}$$

Remark. This determinant is known as the Gramian. It has a geometric interpretation
as the square of the volume of the parallelepiped formed by the three vectors and is
non-negative. If the lengths of the vectors are held fixed, the volume is a maximum
when the vectors are pairwise orthogonal, i.e. the solid is a rectangular parallelepiped.
The volume decreases to zero when the vectors become coplanar.


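Proof. We form the vector
$$\mathbf{v} = \alpha\mathbf{v}_1 + \beta\mathbf{v}_2 + \gamma\mathbf{v}_3 \tag{1.1.18}$$
and the inner products

$$\begin{aligned}
(\mathbf{v}_1, \mathbf{v}) &= \alpha(\mathbf{v}_1, \mathbf{v}_1) + \beta(\mathbf{v}_1, \mathbf{v}_2) + \gamma(\mathbf{v}_1, \mathbf{v}_3), \\
(\mathbf{v}_2, \mathbf{v}) &= \alpha(\mathbf{v}_2, \mathbf{v}_1) + \beta(\mathbf{v}_2, \mathbf{v}_2) + \gamma(\mathbf{v}_2, \mathbf{v}_3), \\
(\mathbf{v}_3, \mathbf{v}) &= \alpha(\mathbf{v}_3, \mathbf{v}_1) + \beta(\mathbf{v}_3, \mathbf{v}_2) + \gamma(\mathbf{v}_3, \mathbf{v}_3).
\end{aligned} \tag{1.1.19}$$

The determinant of this system (regarded as linear equations satisfied by $\alpha, \beta, \gamma$)
is $G$. If $G \neq 0$ and $\mathbf{v}$ is any given vector in $R^3$, then we can determine $\alpha, \beta, \gamma$
uniquely from the system (1.1.19). This means that $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$ is a basis for $R^3$ and
they are linearly independent vectors. Conversely, if the vectors are linearly
independent, then the representation (1.1.18) is unique. This implies the existence of a
unique solution $(\alpha, \beta, \gamma)$ of the system (1.1.19) for any given vector $\mathbf{v}$, i.e. for
arbitrary values of the left hand side in the system. This is possible iff the
determinant $G \neq 0$. ∎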
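The criterion of Lemma 1.1.2 is easy to try out on small examples. The following Python sketch forms the determinant (1.1.17) for three vectors with integer coordinates; the names `inner`, `det3` and `gramian` are ours, and the coordinate formula (1.1.23) for the inner product is assumed.

```python
def inner(x, y):
    """Dot product of two real 3-vectors in an orthonormal basis."""
    return sum(a * b for a, b in zip(x, y))


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def gramian(v1, v2, v3):
    """Determinant of the matrix of inner products (v_j, v_k)."""
    vs = (v1, v2, v3)
    return det3([[inner(vj, vk) for vk in vs] for vj in vs])


independent = gramian((1, 0, 0), (1, 1, 0), (1, 1, 1))
coplanar = gramian((1, 0, 0), (0, 1, 0), (1, 1, 0))
print(independent, coplanar)
```

For the first triple the determinant is 1, so the vectors are linearly independent; for the second, whose third vector is the sum of the first two, it is 0.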


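The orthogonalization process can now be formulated as follows. Given three
linearly independent vectors $\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$, find six constants $a_{jk}$, $1 \leq k \leq j \leq 3$, such
that the three vectors

$$\begin{aligned}
\mathbf{u}_1 &= a_{11}\mathbf{v}_1, \\
\mathbf{u}_2 &= a_{21}\mathbf{v}_1 + a_{22}\mathbf{v}_2, \\
\mathbf{u}_3 &= a_{31}\mathbf{v}_1 + a_{32}\mathbf{v}_2 + a_{33}\mathbf{v}_3
\end{aligned} \tag{1.1.20}$$

satisfy the six conditions

$$(\mathbf{u}_j, \mathbf{u}_k) = \delta_{jk},$$

where $\delta_{jk}$ is the Kronecker delta, i.e. 0 if $j \neq k$ and 1 if $j = k$. The solution is
easily found except for the normalization factors. We have

$$\mathbf{u}_1 = c_1\mathbf{v}_1, \qquad
\mathbf{u}_2 = c_2 \begin{vmatrix} \mathbf{v}_1 & \mathbf{v}_2 \\ (\mathbf{v}_1, \mathbf{v}_1) & (\mathbf{v}_1, \mathbf{v}_2) \end{vmatrix}, \qquad
\mathbf{u}_3 = c_3 \begin{vmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \mathbf{v}_3 \\ (\mathbf{v}_1, \mathbf{v}_1) & (\mathbf{v}_1, \mathbf{v}_2) & (\mathbf{v}_1, \mathbf{v}_3) \\ (\mathbf{v}_2, \mathbf{v}_1) & (\mathbf{v}_2, \mathbf{v}_2) & (\mathbf{v}_2, \mathbf{v}_3) \end{vmatrix}. \tag{1.1.21}$$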

Here the vector-valued determinants are to be understood as follows. Expand each
determinant according to the elements of the first row. The result is clearly of the
form (1.1.20). We can form inner products of the form $(\mathbf{w}, \det(\cdot))$ by taking inner
products of the first row. Thus

$$(\mathbf{w}, \mathbf{u}_2) = c_2 \begin{vmatrix} (\mathbf{w}, \mathbf{v}_1) & (\mathbf{w}, \mathbf{v}_2) \\ (\mathbf{v}_1, \mathbf{v}_1) & (\mathbf{v}_1, \mathbf{v}_2) \end{vmatrix}.$$

In particular, we see that $(\mathbf{v}_1, \mathbf{u}_2)$ involves a determinant with two equal rows which
is zero for any choice of $c_2$. This gives $(\mathbf{u}_1, \mathbf{u}_2) = 0$. Similarly, we see that

$$(\mathbf{u}_3, \mathbf{v}_1) = 0, \qquad (\mathbf{u}_3, \mathbf{v}_2) = 0$$

for any choice of $c_3$, and this gives two more of the conditions following (1.1.20), namely

$$(\mathbf{u}_1, \mathbf{u}_3) = 0, \qquad (\mathbf{u}_2, \mathbf{u}_3) = 0.$$

Thus our three vectors are orthogonal, and by choosing the factors $c_1, c_2, c_3$
properly we obtain an orthonormal system. The constants are expressible in terms
of minors of $G$, but we omit further details. Here the “geometric intuition” has
been eliminated and we have a formulation of the Gram-Schmidt process which
extends to higher dimensions as well as to more general linear spaces.
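For computation one usually does not expand the determinants of (1.1.21) but subtracts from each new vector its components along the vectors already constructed. The Python sketch below follows that familiar scheme, which is not the determinant formula of the text; it produces vectors of the form (1.1.20), and these agree with those of (1.1.21) up to the choice of sign in the normalization factors. The function names and the tolerance `1e-12` are our own assumptions.

```python
import math


def inner(x, y):
    """Dot product of two real vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent real vectors.

    Each new vector has its components along the previously built
    unit vectors removed and is then divided by its length.
    """
    basis = []
    for v in vectors:
        w = [float(c) for c in v]
        for u in basis:
            coeff = inner(w, u)
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        length = math.sqrt(inner(w, w))
        if length < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append([wi / length for wi in w])
    return basis


u = gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
for j in range(3):
    # Each row should be close to the corresponding row of the unit matrix.
    print([round(inner(u[j], u[k]), 12) for k in range(3)])
```

A set of dependent vectors, such as (1, 0, 0), (2, 0, 0), (0, 1, 0), makes the function raise `ValueError`, which corresponds to the vanishing of the Gramian.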

Some further comments should be attached to the notion of the inner product.
If $(\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3)$ is an ordered orthonormal basis of $R^3$ and if

$$\mathbf{x} = (x_1, x_2, x_3), \qquad \mathbf{y} = (y_1, y_2, y_3) \tag{1.1.22}$$
in terms of this basis, then

$$(\mathbf{x}, \mathbf{y}) = x_1y_1 + x_2y_2 + x_3y_3. \tag{1.1.23}$$
This is the classical formula of vector analysis for the dot product.
The inner product $(\mathbf{x}, \mathbf{y})$ is a function on ordered pairs of vectors to numbers.
Such a function is usually referred to as a functional. Definition 1.1.2, (2), (3) shows
that $(\mathbf{x}, \mathbf{y})$ is linear in both arguments. Hence it is called a bilinear functional.


Definition 1.1.3. A linear functional on $R^3$ is a function $f(\mathbf{x})$ from vectors to
numbers such that for all vectors $\mathbf{x}, \mathbf{y}$ and all real numbers $\alpha$

$$f(\mathbf{x} + \mathbf{y}) = f(\mathbf{x}) + f(\mathbf{y}), \qquad f(\alpha\mathbf{x}) = \alpha f(\mathbf{x}). \tag{1.1.24}$$


EXERCISE 1.1 


1. Verify (1.1.23). 


2. In the proof of Lemma 1.1.1 it is stated that the distance from the point $\mathbf{x}$ to the
point $\mathbf{y}$ is $\|\mathbf{x} - \mathbf{y}\|$. Use (1.1.23) to verify that this is indeed the Euclidean distance.


3. Formula (1.1.16) states that $(\mathbf{x}, \mathbf{x})(\mathbf{y}, \mathbf{y}) - (\mathbf{x}, \mathbf{y})^2 \geq 0$. If $\mathbf{x}$ and $\mathbf{y}$ are given by
(1.1.22) show that this is equivalent to

$$\Big(\sum_{j=1}^{3} x_j y_j\Big)^2 \leq \sum_{j=1}^{3} x_j^2 \, \sum_{j=1}^{3} y_j^2.$$

This is a special case of the so-called Cauchy inequality [after Augustin Louis, Baron
de Cauchy (1789-1857)]. What happens to the condition for equality?


4. Show that the principal minors of the Gramian $G$ are positive. A principal minor of
a determinant is obtained by striking out rows and columns with the same subscripts.


5. In formula (1.1.21) the coefficients $c_j$ are expressible as square roots of principal
minors. Find these expressions.


6. Try to prove that $G$ is the square of the volume of the parallelepiped defined by the vectors
$\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3$.


7. Justify the method for forming inner products when one of the vectors is a vector
determinant like (1.1.21).


8. While the dot product is a function from vectors to numbers the so-called vector or
cross product of vector analysis is a vector. With the usual notation

$$\mathbf{x} \times \mathbf{y} = \begin{vmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \mathbf{u}_3 \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix}.$$

In vector analysis it is customary to denote the unit vectors by $\mathbf{i}, \mathbf{j}, \mathbf{k}$ instead of
$\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3$. Show that $\mathbf{x} \times \mathbf{y} = -\mathbf{y} \times \mathbf{x}$ so that this type of multiplication is
non-commutative in the sense that $\mathbf{x} \times \mathbf{y} \neq \mathbf{y} \times \mathbf{x}$. Note also that $\mathbf{x} \times \mathbf{x} = \mathbf{0}$ for all $\mathbf{x}$,
so a product can vanish without either factor being zero.


9. Verify that $\mathbf{i} \times \mathbf{j} = \mathbf{k}$, $\mathbf{j} \times \mathbf{k} = \mathbf{i}$, $\mathbf{k} \times \mathbf{i} = \mathbf{j}$.


10. A linear functional on $R^3$ was introduced in Definition 1.1.3. Such a functional is
said to be bounded if there is a fixed positive number $M$ such that $|f(\mathbf{x})| \leq M\|\mathbf{x}\|$
for all $\mathbf{x}$. Show that there exists a fixed uniquely determined vector $\mathbf{y}$ such that
$f(\mathbf{x}) = (\mathbf{x}, \mathbf{y})$. [Hint: Specify the action of $f$ on a basis and use linearity.]


1.2 THE SPACE $C^n$


We have given a fairly thorough discussion of the space $R^3$ except for linear
transformations, which are postponed until Section 1.3. Here we take up the passage
from three dimensions to $n$ and from real coordinates to complex.

We say that $R^3$ is a three-dimensional vector space because there are sets of
three linearly independent vectors, but any set of four or more vectors is
necessarily linearly dependent. We now proceed to define $C^n$.


Definition 1.2.1. The space $C^n$ is the set of all ordered n-tuples of complex
numbers $(z_1, z_2, \ldots, z_n) = \mathbf{z}$ with the following conventions concerning equality,
algebraic operations, and metric properties:

1) $\mathbf{z} = \mathbf{w}$ iff $z_j = w_j$, $j = 1, 2, \ldots, n$;

2) $\mathbf{z} + \mathbf{w} = (z_1 + w_1, z_2 + w_2, \ldots, z_n + w_n)$;

3) $\alpha\mathbf{z} = (\alpha z_1, \alpha z_2, \ldots, \alpha z_n)$, $\alpha$ any complex number;

4) There is an inner product

$$(\mathbf{z}, \mathbf{w}) = \sum_{j=1}^{n} z_j \overline{w_j},$$

where the bar denotes the complex conjugate,

5) $(\mathbf{z}, \mathbf{z}) = \|\mathbf{z}\|^2$, and

6) The Euclidean distance between $\mathbf{z}$ and $\mathbf{w}$ is $\|\mathbf{z} - \mathbf{w}\|$.


The space $R^n$ is obtained from $C^n$ by restricting the coordinates $z_j$ to be real
numbers. This also means restricting the multipliers $\alpha$ to be real numbers and the
bar can be omitted in (4).

The n-tuples are still regarded as vectors and are also spoken of as points or
elements of $C^n$. In $C^n$ we have a definition of addition, vector addition, given by (2).
There is also a notion of scalar multiplication defined by (3). The symbol $\|\mathbf{z}\|$ (read
“the norm of $\mathbf{z}$”) is the natural generalization of the length of a vector or of the
distance from the origin to the point $\mathbf{z}$. We speak of Euclidean distance in (6)
because that is what it reduces to for $n = 2$ or 3 and also because, as we shall see
later, there are other possibilities of defining a useful notion of distance in the
space $C^n$.

From (4) we get

$$(\mathbf{w}, \mathbf{z}) = \overline{(\mathbf{z}, \mathbf{w})}. \tag{1.2.1}$$

The definition of the inner product is in agreement with Definition 1.1.2, where we
replace “real” by “complex.” The bilinearity of the inner product is expressed by
the formulas

$$(\alpha\mathbf{z}_1 + \beta\mathbf{z}_2, \mathbf{w}) = \alpha(\mathbf{z}_1, \mathbf{w}) + \beta(\mathbf{z}_2, \mathbf{w}), \tag{1.2.2}$$

$$(\mathbf{z}, \alpha\mathbf{w}_1 + \beta\mathbf{w}_2) = \overline{\alpha}(\mathbf{z}, \mathbf{w}_1) + \overline{\beta}(\mathbf{z}, \mathbf{w}_2). \tag{1.2.3}$$

Note that the multipliers $\alpha$ and $\beta$ are replaced by their complex conjugates in the
right member of (1.2.3).
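The rules (1.2.1) to (1.2.3) can be tried out on a small example. The following Python sketch uses the built-in complex type and a helper `inner` of our own naming that implements item (4) of Definition 1.2.1; the particular vectors and the multiplier `alpha` are chosen only for illustration.

```python
def inner(z, w):
    """Inner product (z, w): the sum of z_j times the conjugate of w_j."""
    return sum(a * b.conjugate() for a, b in zip(z, w))


z = (1 + 2j, 3j)
w = (2 - 1j, 1 + 1j)
alpha = 2 + 1j

print(inner(z, w))  # (3+8j)
print(inner(w, z))  # (3-8j), the complex conjugate, as in (1.2.1)
print(inner(z, z))  # (14+0j), the square of the norm of z

# The multiplier leaves the second argument as its conjugate, as in (1.2.3).
scaled_w = tuple(alpha * c for c in w)
print(inner(z, scaled_w), alpha.conjugate() * inner(z, w))
```

Both quantities on the last line equal 14 + 13i, whereas multiplying the first argument by `alpha` would multiply the inner product by `alpha` itself, in accordance with (1.2.2).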

Lemma 1.1.1 remains valid in $R^n$, for two vectors $\mathbf{x}$ and $\mathbf{y}$ determine a
two-dimensional plane formed by all the vectors

$$\alpha\mathbf{x} + \beta\mathbf{y}$$

with real $\alpha$ and $\beta$, and the Law of Cosines may be applied to the triangle with sides
of lengths $\|\mathbf{x}\|$, $\|\mathbf{y}\|$, and $\|\mathbf{y} - \mathbf{x}\|$. On the other hand, the lemma normally does not
make sense in $C^n$ since $(\mathbf{z}, \mathbf{w})$ is complex valued.
Corollary 1 of Lemma 1.1.1 becomes
Definition 1.2.2. The vectors $\mathbf{z}$ and $\mathbf{w}$ are said to be orthogonal if
$$(\mathbf{z}, \mathbf{w}) = 0. \tag{1.2.4}$$

Corollary 2, on the other hand, becomes a theorem to be proved.
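Theorem 1.2.1. For any two vectors $\mathbf{z}$ and $\mathbf{w}$ in $C^n$
$$|(\mathbf{z}, \mathbf{w})| \leq \|\mathbf{z}\|\,\|\mathbf{w}\| \tag{1.2.5}$$

with equality iff $\mathbf{w} = \gamma\mathbf{z}$ for some complex number $\gamma$.


Proof. Let $\alpha$ be a complex number to be disposed of later and consider
$\|\mathbf{w} - \alpha\mathbf{z}\|^2$. This is a non-negative real number and zero iff $\mathbf{w} = \alpha\mathbf{z}$. Then

$$\begin{aligned}
\|\mathbf{w} - \alpha\mathbf{z}\|^2 &= (\mathbf{w} - \alpha\mathbf{z}, \mathbf{w} - \alpha\mathbf{z}) \\
&= (\mathbf{w}, \mathbf{w}) - \overline{\alpha}(\mathbf{w}, \mathbf{z}) - \alpha(\mathbf{z}, \mathbf{w}) + |\alpha|^2 (\mathbf{z}, \mathbf{z}).
\end{aligned}$$
Note that the second and the third terms of the last member are complex conjugates.
If now $(\mathbf{z}, \mathbf{z}) = 0$, then $\mathbf{z} = \mathbf{0}$ and $(\mathbf{z}, \mathbf{w}) = 0$ so that (1.2.5) is trivially true
in this case. If $(\mathbf{z}, \mathbf{z}) \neq 0$, we can take

$$\alpha = \frac{(\mathbf{w}, \mathbf{z})}{(\mathbf{z}, \mathbf{z})}$$
and obtain
$$\|\mathbf{w}\|^2 - \frac{|(\mathbf{w}, \mathbf{z})|^2}{\|\mathbf{z}\|^2} \geq 0$$

and this implies (1.2.5). ∎
Using Definition 1.2.1 we see that (1.2.5) may be written

$$\Big|\sum_{j=1}^{n} z_j \overline{w_j}\Big|^2 \leq \sum_{j=1}^{n} |z_j|^2 \, \sum_{j=1}^{n} |w_j|^2 \tag{1.2.6}$$

with equality iff $w_j = \gamma z_j$, $j = 1, 2, \ldots, n$ for some fixed $\gamma$. This is the general form of
Cauchy’s inequality.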
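The choice of $\alpha$ made in the proof can be followed numerically. With $\alpha = (\mathbf{w}, \mathbf{z})/(\mathbf{z}, \mathbf{z})$ the expansion of $\|\mathbf{w} - \alpha\mathbf{z}\|^2$ collapses to $\|\mathbf{w}\|^2 - |(\mathbf{w}, \mathbf{z})|^2/\|\mathbf{z}\|^2$, and the Python sketch below compares the two sides. The helper names and the sample vectors are our own; the complex inner product of Definition 1.2.1 is assumed.

```python
def inner(z, w):
    """Inner product (z, w): the sum of z_j times the conjugate of w_j."""
    return sum(a * b.conjugate() for a, b in zip(z, w))


def residual_norm_squared(z, w):
    """||w - alpha z||^2 with alpha = (w, z)/(z, z); requires z != 0."""
    alpha = inner(w, z) / inner(z, z)
    r = tuple(b - alpha * a for a, b in zip(z, w))
    return inner(r, r).real


z = (1 + 2j, 3j)
w = (2 - 1j, 1 + 1j)

gap = residual_norm_squared(z, w)
bound = inner(w, w).real - abs(inner(w, z)) ** 2 / inner(z, z).real
print(gap, bound)  # both close to 25/14

# Equality in (1.2.5): w is a complex multiple of z.
gamma = 1 - 2j
multiple = tuple(gamma * c for c in z)
print(residual_norm_squared(z, multiple))  # close to 0
```

Here $\|\mathbf{w}\|^2 = 7$, $|(\mathbf{w}, \mathbf{z})|^2 = 73$ and $\|\mathbf{z}\|^2 = 14$, so the common value is $7 - 73/14 = 25/14 > 0$ and the inequality is strict for this pair.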


Definition 1.2.3. A set of $k$ vectors $\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_k$ in $C^n$ is linearly independent
over the complex field $C$ if

$$c_1\mathbf{z}_1 + c_2\mathbf{z}_2 + \cdots + c_k\mathbf{z}_k = \mathbf{0} \tag{1.2.7}$$
for complex numbers $c_1, c_2, \ldots, c_k$ implies
$$c_1 = c_2 = \cdots = c_k = 0.$$
The set is linearly dependent if $k$ numbers $c_1, c_2, \ldots, c_k$, not all zero, can be found
for which (1.2.7) holds.


Note that we are here considering linear independence over the complex field
while in Section 1.1 the real field $R$ was used. The two vectors $(1, 0)$ and $(i, 0)$
belong to $C^2$. They are independent over $R$ but not over $C$.
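The difference between the two fields shows up already in this example. In the Python sketch below, which uses helper names of our own, the combination with the complex coefficients $i$ and $-1$ gives the zero vector, while over the reals each vector of $C^2$ is identified with the 4-tuple of the real and imaginary parts of its coordinates (an identification we assume for the purpose of the check), and a non-vanishing 2 by 2 minor shows that no real relation exists.

```python
def combine(coeffs, vectors):
    """Return the linear combination sum_k c_k * v_k, coordinate by coordinate."""
    n = len(vectors[0])
    return tuple(sum(c * v[j] for c, v in zip(coeffs, vectors))
                 for j in range(n))


def as_real(v):
    """Identify a vector of C^2 with (Re z1, Im z1, Re z2, Im z2)."""
    return tuple(part for z in v for part in (z.real, z.imag))


v1 = (1 + 0j, 0j)
v2 = (1j, 0j)

# Over C: i*v1 - v2 = 0 with coefficients not all zero, so dependent.
relation = combine((1j, -1), (v1, v2))
print(relation)

# Over R: a*v1 + b*v2 has first coordinate a + b*i, zero only if a = b = 0.
r1, r2 = as_real(v1), as_real(v2)
minor = r1[0] * r2[1] - r1[1] * r2[0]  # 2x2 minor of the real coordinates
print(minor)
```

The minor equals 1, so the two real 4-tuples are linearly independent, in agreement with the statement of the text.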
Theorem 1.2.2. There exist sets of $n$ vectors in $C^n$ which are linearly independent
over $C$ but any set of $n + 1$ vectors is linearly dependent.


Proof. Let $\mathbf{u}_j$ denote the unit vector which has a one in the $j$th place and zeros
elsewhere. Then

$$\sum_{j=1}^{n} c_j\mathbf{u}_j = (c_1, c_2, \ldots, c_n) = \mathbf{c}.$$

This is the zero vector iff all the $c_j$’s are zero. Thus the set $\{\mathbf{u}_j\}$ is a set of $n$ vectors
in $C^n$ linearly independent over $C$. On the other hand, given $n + 1$ vectors
$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n+1}$,

$$\mathbf{v}_k = (c_{1k}, c_{2k}, \ldots, c_{nk}), \qquad k = 1, 2, \ldots, n + 1,$$

we can find $n + 1$ numbers $c_1, c_2, \ldots, c_{n+1}$, not all zero, such that if we multiply the
$k$th vector by $c_k$ and add the results, we obtain the zero vector. The conditions
that have to be satisfied are

$$\sum_{k=1}^{n+1} c_{jk}c_k = 0, \qquad j = 1, 2, \ldots, n.$$

This is a system of $n$ linear equations in $n + 1$ unknowns $c_1, c_2, \ldots, c_{n+1}$. Since
the system is homogeneous and has more unknowns than equations, there exist
non-trivial solutions, i.e. there is at least
one set $\{c_k\}$ where not all $c_k$’s are zero, such that

$$\sum_{k=1}^{n+1} c_k\mathbf{v}_k = \mathbf{0}$$
and the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n+1}$ are linearly dependent over $C$. ∎


We say that $C^n$ is n-dimensional over $C$ since we can find $n$ linearly independent
vectors and any larger set is linearly dependent.
A set of $n$ linearly independent vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ forms a basis of $C^n$, i.e.
every vector $\mathbf{v} \in C^n$ can be written in a unique manner as a linear combination of the
$\mathbf{v}_k$’s, say

$$\mathbf{v} = \sum_{k=1}^{n} c_k\mathbf{v}_k \tag{1.2.8}$$

with complex coefficients $c_k$. There is such a representation, for the $n + 1$ vectors
$\mathbf{v}, \mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are linearly dependent while the set $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ is made up of
linearly independent vectors by assumption. This gives a relation of the type
(1.2.8). There is one and only one such representation, for if there were two, then
there would be a linear relation between the basis vectors and this is absurd.
Hence every vector $\mathbf{v} \in C^n$ has a unique representation in terms of the given basis.
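Just as in $R^3$, we can replace a given basis by an orthonormal one using the
Gram-Schmidt process. Formula (1.1.21) generalizes right away and we find for
$k = 1, 2, \ldots, n$ that

$$\mathbf{u}_k = c_k \begin{vmatrix}
\mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_k \\
(\mathbf{v}_1, \mathbf{v}_1) & (\mathbf{v}_1, \mathbf{v}_2) & \cdots & (\mathbf{v}_1, \mathbf{v}_k) \\
(\mathbf{v}_2, \mathbf{v}_1) & (\mathbf{v}_2, \mathbf{v}_2) & \cdots & (\mathbf{v}_2, \mathbf{v}_k) \\
\vdots & \vdots & & \vdots \\
(\mathbf{v}_{k-1}, \mathbf{v}_1) & (\mathbf{v}_{k-1}, \mathbf{v}_2) & \cdots & (\mathbf{v}_{k-1}, \mathbf{v}_k)
\end{vmatrix} \tag{1.2.9}$$

where $c_k$ is a real normalization factor. The verification is left to the reader.
Finally, a definition of an important concept which we shall meet over and over
again.


Definition 1.2.4. A subset $K$ of $C^n$ is said to be convex if $\mathbf{x}, \mathbf{y} \in K$ implies that
the points $t\mathbf{x} + (1 - t)\mathbf{y} \in K$ where $0 \leq t \leq 1$. These points form a line
segment joining $\mathbf{x}$ and $\mathbf{y}$.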
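Definition 1.2.4 can be explored by sampling points of the segment $t\mathbf{x} + (1 - t)\mathbf{y}$. The Python sketch below does so for two sets in $C^2$: the set of points with $\|\mathbf{z}\| \leq 1$ and the set of points with $\|\mathbf{z}\| = 1$. The function names, the number of sample points and the tolerance are our own assumptions, and a finite sample can only refute convexity or make it plausible; it proves nothing.

```python
import math


def norm(z):
    """Euclidean norm of a vector with complex coordinates."""
    return math.sqrt(sum(abs(c) ** 2 for c in z))


def segment_point(x, y, t):
    """The point t*x + (1 - t)*y of the segment joining x and y."""
    return tuple(t * a + (1 - t) * b for a, b in zip(x, y))


def segment_stays_in(member, x, y, steps=100):
    """Check membership at equally spaced points of the segment."""
    return all(member(segment_point(x, y, k / steps))
               for k in range(steps + 1))


def norm_at_most_one(z):
    return norm(z) <= 1 + 1e-12


def norm_equals_one(z):
    return abs(norm(z) - 1) <= 1e-12


x = (1 + 0j, 0j)
y = (0j, 1j)
print(segment_stays_in(norm_at_most_one, x, y))  # True
print(segment_stays_in(norm_equals_one, x, y))   # False
```

The midpoint of the segment has norm $\sqrt{1/2}$, so it belongs to the first set but not to the second; the set of points of norm exactly one is therefore not convex.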


EXERCISE 1.2 


1. Prove the subadditive property of the norm, namely the inequality
$$\|\mathbf{z} + \mathbf{w}\| \leq \|\mathbf{z}\| + \|\mathbf{w}\|,$$
and decide when equality holds. [Hint: Use the inner product and Theorem 1.2.1.]

2. Prove the triangle inequality for distances in $C^n$
$$\|\mathbf{z} - \mathbf{w}\| \leq \|\mathbf{z} - \mathbf{v}\| + \|\mathbf{v} - \mathbf{w}\|$$
and decide when equality holds. Why the name?


3. State and prove the analogue of Lemma 1.1.2.

4. For arbitrary vectors $\mathbf{z}$ and $\mathbf{w}$ in $C^n$ prove the Parallelogram Law

$$\|\mathbf{z} + \mathbf{w}\|^2 + \|\mathbf{z} - \mathbf{w}\|^2 = 2\big[\|\mathbf{z}\|^2 + \|\mathbf{w}\|^2\big].$$
Why the name?


5. For any three vectors $\mathbf{z}_1, \mathbf{z}_2, \mathbf{z}_3$, prove that

$$\|\mathbf{z}_1 - \mathbf{z}_2\|^2 + \|\mathbf{z}_2 - \mathbf{z}_3\|^2 + \|\mathbf{z}_3 - \mathbf{z}_1\|^2 + \|\mathbf{z}_1 + \mathbf{z}_2 + \mathbf{z}_3\|^2 = 3\big[\|\mathbf{z}_1\|^2 + \|\mathbf{z}_2\|^2 + \|\mathbf{z}_3\|^2\big].$$

6. In the preceding problem, suppose that $z_1, z_2, z_3$ are points on the unit circle in the
complex plane. Use the formula to discuss the following problem in maxima and
minima with a side condition. A triangle is inscribed in a circle of radius one. What
triangle would maximize the sum of the squares of the lengths of its sides?


7. Extend the result of Problem 10, Exercise 1.1, to $C^n$, i.e. prove that a linear bounded
functional $f(\mathbf{z})$ on $C^n$ is necessarily an inner product, $f(\mathbf{z}) = (\mathbf{z}, \mathbf{w})$ for some fixed $\mathbf{w}$.


1.3 


ΝΜ 


10. 


11. 


12. 


13. 
14. 


15. 


16. 
17. 


1.3 
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Given a set of π linearly independent vectors in C”, show that any non-void subset is 
also linearly independent. 


Convergence in the complex plane is reduced to convergence of positive numbers by 
the convention that lim 2; = Zo iff lim |z; — Z| = 0. Similarly for vectors in Cs; 


j7o j7o@ 
Here lim Z; = Zo iff lim IE — Zo || ΞΞ 0. Suppose Z; = (2; 2j2> “609 Z jn) and 
jo jro 
Zh = (Zo15 ZQ2> "559 Zon): Show that lim Lj = Zo iff lim Z jk = Zok fork = 1, Zz ere ἢ. 


In R" the unit hypercube is the set of points of coordinates between 0 and 1, limits 
included. Show that this is a convex set. 


Show that the set of points (z; ||z|| < 1), known as the closed unit sphere or the unit 
ball, is a convex set. 


If K is a convex set in C”, show that for any choice of m points Z,, Zz, ..., Ζμ In K 
and m numbers a, such that0 < α; <1, fj =1, 2,..., m, and aE a; =1, then 
ya 42; K. 

Prove that the intersection of two convex sets is convex or void. 


The convex hull H of a set S in C" is the intersection of all convex sets containing S. 

Show that H is the set of all finite sums of the form ))7= 1 XZ; withz;EéS,0 < a;< 1, 
ja1 0; = 1. 

15. A set $C_0 \subset C^n$ is called a cone if $z \in C_0$, $\alpha > 0$, implies that $\alpha z \in C_0$. A cone contains
the origin. It is said to be proper if $z \in C_0$, $-z \in C_0$ implies $z = 0$. Find when a
cone is convex.


16. Verify (1.2.9).


17. The Gramian of $n$ vectors $v_j$ in $C^n$ is by definition

$$G = \det[(v_j, v_k)] = \det(g_{jk}).$$

Show that $g_{kj} = \overline{g_{jk}}$ and prove that $G$ and all its principal minors are real and
different from zero. Are they positive?


1.3 LINEAR TRANSFORMATIONS


We shall have to use some of the conventions and notations of intuitive set theory,
some of which have already figured implicitly in the preceding. A set $S$ is a
collection of objects denoted by some symbol like $x$. Here the nature of the objects
is immaterial and only the membership counts. We demand that one and only one
of the statements "$x$ is a member of $S$" and "$x$ is not a member of $S$" be true. If the
first one is true, we write $x \in S$; if the second, $x \notin S$. This implies that two sets are
equal iff they have the same members. A set $S_0$ is a subset of $S$, written

$$S_0 \subseteq S, \tag{1.3.1}$$

if all members of $S_0$ are also members of $S$. It is a proper subset, written $S_0 \subset S$,
if (1.3.1) holds but there are some elements (= members) of $S$ not in $S_0$.

Consider now two sets $S_1$ and $S_2$ and a collection $T$ of ordered pairs $[x, y]$
such that (1) $x \in S_1$, $y \in S_2$, (2) every $x_0 \in S_1$ is the first element of at most one pair
$[x, y]$. Such a collection $T$ of ordered pairs is a mapping from $S_1$ into $S_2$. If
$[x_0, y_0] \in T$, we call $y_0$ the image of $x_0$ under the mapping $T$ and $x_0$ is the inverse
image or pre-image of $y_0$. The set of all elements $x \in S_1$ which actually occur as
first elements of pairs $[x, y]$ in $T$ forms the domain of $T$ while the elements $y \in S_2$
which are present as second elements of pairs $[x, y]$ form the range of $T$. In symbols

$$D[T] = D = \{x;\ x \in S_1,\ [x, y] \in T \text{ for some } y\},$$
$$R[T] = R = \{y;\ y \in S_2,\ [x, y] \in T \text{ for some } x\}. \tag{1.3.2}$$


We shall now be concerned with a class of mappings of $C^n$ into itself, known
as linear transformations. Here $S_1 = S_2 = C^n$, the domain will be all of $C^n$ while
the range may be a proper subset.


Definition 1.3.1. $A$ is a linear transformation from $C^n$ into itself, if (1) to each
vector $x \in C^n$ corresponds a unique vector $y = A(x)$ in $C^n$, and (2) for each pair
of complex numbers $\alpha$, $\beta$ and each pair of vectors $x_1$, $x_2$

$$A(\alpha x_1 + \beta x_2) = \alpha A(x_1) + \beta A(x_2). \tag{1.3.3}$$

The range of $A$ is the set of all vectors $A(x)$ and (1.3.3) shows that $R[A]$ is a
linear subspace of $C^n$. Verify!

We define the null space of $A$, also called the kernel of $A$, as the set of all vectors
$x$ which are mapped into the zero vector,

$$N[A] = \{x;\ x \in C^n,\ A(x) = 0\}. \tag{1.3.4}$$

Since

$$A(0) = A(0 + 0) = 2A(0)$$

we see that

$$A(0) = 0 \tag{1.3.5}$$

so that the zero element always belongs to $N[A]$.

Formula (1.3.3) shows that $N[A]$ is a linear subspace of $C^n$. (Why?) The
transformation $A$ turns out to be particularly simple when the null space reduces
to the zero element, $N[A] = \{0\}$. Then $R[A] = C^n$ and the mapping is one-to-one
so that each $y$ is the image of one and only one point $x$. In the collection of ordered
pairs $[x, y]$ which defines $A$, of course, each $x$ occurs once and only once, but now
each $y$ also occurs once and only once. Thus the collection of pairs $[y, x]$ is also a
mapping of $C^n$ into itself, the so-called inverse mapping $A^{-1}$, and this mapping is
also a linear transformation. All this will be proved below.

It should be pointed out that $N[A]$ does not have to reduce to the zero element.
If $N[A] \ne \{0\}$, we shall see that $R[A]$ is a proper linear subspace and the mapping is no
longer one-to-one. In an extreme case, the range may reduce to the zero element,
$N[A] = C^n$ and $A$ "annihilates" the whole space.




We start by proving 


Theorem 1.3.1. The linear mapping defined by $A$ is one-to-one iff $N[A] = \{0\}$.


Remark. We write "1-1" for "one-to-one." This property expresses that $x_1 \ne x_2$
implies $A(x_1) \ne A(x_2)$.


Proof. If for some pair of distinct vectors $x_1$, $x_2$ we should have $A(x_1) = A(x_2)$,
then by the linearity of $A$

$$A(x_1 - x_2) = 0$$

and $x_1 - x_2$ is a non-zero element of $N[A]$. Conversely, if $N[A] = \{0\}$, then $A(x_1) = A(x_2)$ for $x_1 \ne x_2$
leads to a contradiction so the condition $N[A] = \{0\}$ is necessary and sufficient for
$A$ to be 1-1. ∎


In the modern terminology of N. Bourbaki the term injective is used to 
designate a 1-1 transformation (an injection) and surjective (surjection) for a 
transformation which is onto, i.e. whose range is the whole space. 


Theorem 1.3.2. $R[A] = C^n$ iff $N[A] = \{0\}$.


Proof. For this fact we shall give two proofs. The first, based on a personal 
communication by Bertram Yood, is abstract and proves the statement in a few 
lines. The second is computational and leads to the representation of A by a matrix 
which is basic for the following. 


I. Let $u_1, u_2, \ldots, u_n$ be a basis for $C^n$. Suppose that $N[A] = \{0\}$. Then the
$n$ vectors $A(u_1), A(u_2), \ldots, A(u_n)$ are linearly independent. For if

$$\sum_{j=1}^n c_j A(u_j) = 0, \quad \text{then} \quad A\Big(\sum_{j=1}^n c_j u_j\Big) = 0 \quad \text{and} \quad \sum_{j=1}^n c_j u_j = 0$$

and all the $c_j$'s are zero since $\{u_j\}$ is a basis. But then $R[A]$, a linear subspace of $C^n$,
contains a set of $n$ linearly independent vectors, the $A(u_j)$, and is thus of dimension
$n$. Hence $R[A] = C^n$. On the other hand, suppose that $R[A] = C^n$. Then $R[A]$
contains a set of $n$ linearly independent vectors $v_1, v_2, \ldots, v_n$. These are images of
vectors $u_j$ in $C^n$ under the mapping $A$, say $v_j = A(u_j)$. Then

$$\sum_{j=1}^n c_j v_j = \sum_{j=1}^n c_j A(u_j) = A\Big(\sum_{j=1}^n c_j u_j\Big).$$

The first member can be the zero vector iff all $c_j = 0$. In particular the $u_j$ are
linearly independent and form a basis, so that every $v$ in $C^n$ is of the form
$\sum c_j u_j$. Hence $A(v) = 0$ iff $v = 0$ or $N[A] = \{0\}$.


II. In the second proof we determine the structure of $A$. We take as the basis
for $C^n$ the unit vectors $u_j$, where $u_j$ has a 1 in the $j$th place, all other entries being 0.
The mapping $A$ is uniquely determined by its effect on the unit vectors plus the
assumption that $A$ is linear. Since $A(u_k)$ is a vector in $C^n$, it is a linear combination
of the basis vectors. This implies the existence of $n^2$ numbers $a_{jk}$ such that

$$A(u_k) = \sum_{j=1}^n a_{jk} u_j, \quad k = 1, 2, \ldots, n. \tag{1.3.6}$$

It is clear that this set exists and is uniquely determined by $A$.
Suppose now that

$$v = \sum_{k=1}^n b_k u_k \tag{1.3.7}$$

is the representation of the vector $v$. By the linearity of $A$

$$w = A(v) = \sum_{k=1}^n b_k A(u_k) = \sum_{k=1}^n b_k \Big\{\sum_{j=1}^n a_{jk} u_j\Big\} = \sum_{j=1}^n \Big\{\sum_{k=1}^n a_{jk} b_k\Big\} u_j. \tag{1.3.8}$$


This means that in the chosen coordinate system the mapping $A$ is completely
determined by the array $a_{jk}$, $j, k = 1, 2, \ldots, n$, or $n^2$ complex numbers. We introduce
the symbol

$$\mathcal{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \tag{1.3.9}$$

and refer to $\mathcal{A}$ as the matrix whose entry in the place $(j, k)$ is $(\mathcal{A})_{jk} = a_{jk}$. We also
write $\mathcal{A} = (a_{jk})$ and speak of the $j$th row and the $k$th column of $\mathcal{A}$. We write
symbolically

$$w = \mathcal{A} \cdot v \tag{1.3.10}$$

and think of the right hand side as a "product" of a matrix with a vector. The
result is again a vector, namely $w = (w_1, w_2, \ldots, w_n)$,

$$w_j = \sum_{k=1}^n a_{jk} b_k. \tag{1.3.11}$$
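The passage from (1.3.6) to (1.3.11) can be checked numerically. The following is a minimal sketch in Python, not part of the original text; the helper names `mat_vec` and `column` and the 2 by 2 sample data are hypothetical choices made only for illustration. It computes $w = \mathcal{A} \cdot v$ by the row formula (1.3.11) and again as the linear combination of the images $A(u_k)$, i.e. of the columns of the array.

```python
def mat_vec(a, b):
    """Return w = A . v with w_j = sum_k a[j][k] * b[k], as in (1.3.11)."""
    return [sum(a_jk * b_k for a_jk, b_k in zip(row, b)) for row in a]


def column(a, k):
    """k-th column of the array: the coordinates of A(u_k) in (1.3.6)."""
    return [row[k] for row in a]


# Hypothetical 2 by 2 example with complex entries.
A = [[1 + 1j, 2], [0, 3 - 1j]]
v = [1j, 1]                      # coordinates b_1, b_2 of v

w = mat_vec(A, v)

# By linearity, w must also equal b_1 * A(u_1) + b_2 * A(u_2), as in (1.3.8).
by_columns = [v[0] * c0 + v[1] * c1
              for c0, c1 in zip(column(A, 0), column(A, 1))]

print(w)
print(by_columns)
```

Both computations give the same vector, which is the content of (1.3.8).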


Let us now again examine the implications of the hypothesis $N[A] = \{0\}$.
This says that the homogeneous system of $n$ linear equations in $n$ unknowns

$$\sum_{k=1}^n a_{jk} b_k = 0, \quad j = 1, 2, \ldots, n, \tag{1.3.12}$$

has the unique solution

$$b_1 = b_2 = \cdots = b_n = 0. \tag{1.3.13}$$

For this to be the case, it is necessary and sufficient that the determinant of the
system is different from zero

$$\det(a_{jk}) \ne 0. \tag{1.3.14}$$

By definition this is the determinant of the matrix $\mathcal{A}$.
Now condition (1.3.14) implies that the non-homogeneous system

$$\sum_{k=1}^n a_{jk} b_k = w_j, \quad j = 1, 2, \ldots, n, \tag{1.3.15}$$

has a unique solution $(b_1, b_2, \ldots, b_n) = v$ for each given vector $w = (w_1, w_2, \ldots, w_n)$
of $C^n$. Hence $R[A] = C^n$.

On the other hand, if $R[A] = C^n$, then the system (1.3.15) must have a
solution vector $v$ for each given $w$. This implies that (1.3.14) holds and $v = 0$ is the
only solution of (1.3.12) or $N[A] = \{0\}$. ∎


The matrix $\mathcal{A}$ is the representation of $A$ under the chosen coordinate system.
Changing the basis changes the representation matrix. The passage from one basis
to another also involves a matrix transformation, so we shall have to consider the
composition of linear transformations and of matrices in the next section.

We refer to $\mathcal{A}$ as an $n$ by $n$ matrix and denote the set of all such matrices by $\mathfrak{M}_n$.
The structure of this set will also be examined in the next section.

If $\det(a_{jk}) \ne 0$ we say that the matrix $\mathcal{A}$ is regular, otherwise singular. The
singular case calls for more detailed comments. A square submatrix of $\mathcal{A}$ is one
obtained by crossing out the same number of rows and columns of $\mathcal{A}$. We say that
$\mathcal{A}$ is of rank $n - q$ if there is a non-singular submatrix of $\mathcal{A}$ with $n - q$ rows and
columns but all submatrices with more than $n - q$ rows and columns are singular.
The number $q$ is known as the nullity of the matrix (and of the determinant and
the system of equations).

Using $\dim(X)$ as notation for the dimension of a linear space $X$ we have

$$\dim\{N[A]\} = q, \tag{1.3.16}$$

for now the system (1.3.12) has $q$ and exactly $q$ linearly independent solution
vectors. The nullity also crops up in the discussion of the system (1.3.15) where
now the right hand sides can no longer be chosen arbitrarily, but must satisfy
linear relations of the form

$$c_1 w_1 + c_2 w_2 + \cdots + c_n w_n = 0, \tag{1.3.17}$$

$q$ in number. These are conditions which the image $w = A(v)$ must satisfy, i.e.
restrictions imposed on $R[A]$. The range space is now of dimension $n - q$,
$\dim\{R[A]\} = n - q$. Various interpretations may be put on the conditional
equations (1.3.17). Each of them is the equation of a hyperplane through the origin
in $C^n$. Equivalently (1.3.17) figures as the null space of a linear functional

$$F(w) = (w, c), \quad c = (\bar{c}_1, \bar{c}_2, \ldots, \bar{c}_n). \tag{1.3.18}$$
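The dimension count $\dim\{N[A]\} = q$, $\dim\{R[A]\} = n - q$ can be illustrated on a small singular matrix. The sketch below is an illustration only and rests on assumptions not in the text: the rank is computed by Gaussian elimination over exact rationals (rather than by inspecting submatrices), and the 3 by 3 sample matrix, the helper names `rank` and `mat_vec`, and the vectors `v` and `c` are hypothetical.

```python
from fractions import Fraction


def rank(matrix):
    """Rank by Gaussian elimination over exact rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, n_rows):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def mat_vec(a, x):
    """w_j = sum_k a[j][k] * x[k]."""
    return [sum(a_jk * x_k for a_jk, x_k in zip(row, x)) for row in a]


# Hypothetical singular matrix: the third row is the sum of the first two.
A = [[1, 2, 3], [0, 1, 1], [1, 3, 4]]
n = len(A)
q = n - rank(A)                  # nullity

v = [1, 1, -1]                   # spans the null space: A . v = 0
c = [1, 1, -1]                   # relation (1.3.17): w_1 + w_2 - w_3 = 0

w = mat_vec(A, [2, -1, 5])       # an arbitrary image vector
print(rank(A), q, mat_vec(A, v), sum(ci * wi for ci, wi in zip(c, w)))
```

Here $q = 1$: there is one independent solution of the homogeneous system and one relation of type (1.3.17) which every image vector satisfies.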


Thus it is seen that there is a profound difference between regular and singular
mapping matrices. If $\mathcal{A}$ is regular, the mapping is 1-1 and onto and there is an
inverse with the same properties. If $\mathcal{A}$ is singular, then all of $C^n$ collapses onto a
proper linear subspace. Only points of this space are images under $A$, i.e. occur as
second elements of pairs $[x, y]$ but each such $y$ figures as second element in
infinitely many pairs. Finally, we restate one of the main results of this section as


Theorem 1.3.3. Once a basis has been chosen for $C^n$, there is a 1-1 correspondence
between transformations $A$ of $\mathfrak{E}(C^n)$ and matrices $\mathcal{A}$ of $\mathfrak{M}_n$. Here $\mathfrak{E}(C^n)$
denotes the set of all linear bounded transformations from $C^n$ to itself.

EXERCISE 1.3

1. Give an example of a 3 by 3 matrix of rank 2. Find its range and nullspace.


2. If $K \subset C^n$ is a convex set, show that its image $A(K)$ under the mapping $A \in \mathfrak{E}(C^n)$
is also convex.


3. Let $E = \{z;\ \|z\| \le 1\}$ be the unit ball of $C^n$. Show that it is convex and prove that
no point of $A(E)$, $A \in \mathfrak{E}(C^n)$, can have a distance from the origin exceeding
$\left[\sum_{j=1}^n \sum_{k=1}^n |a_{jk}|^2\right]^{1/2}$.


4. A linear transformation $U \in \mathfrak{E}(C^n)$ which preserves inner products, $(Uz, Uw) = (z, w)$,
is called unitary. Prove that distances are preserved. Take an orthonormal
basis for $C^n$ and let $\mathcal{U}$ be the matrix representing $U$ in this system. $\mathcal{U}$ is called a
unitary matrix. Show that its column vectors form an orthonormal system. Take
$\mathcal{U} = (u_{jk})$ and prove that $\mathcal{U}$ is regular.

5. What can be said about the row vectors in the unitary case? The $j$th row vector
is the vector $(u_{j1}, u_{j2}, \ldots, u_{jn})$.

6. What is the value of $\det(u_{jk})$?

7. In the unitary case the unit ball is obviously mapped in a 1-1 manner onto itself.
Why? What is the bound given in Problem 3 when $a_{jk} = u_{jk}$?


8. In the notation of Problem 17, Exercise 1.2, let $\mathcal{G}$ be the matrix corresponding to the
Gramian $G$. When is such a matrix unitary?

9. What is the argument that leads to formula (1.3.17) when the determinant of the
matrix $\mathcal{A}$ is zero? Justify the assertion that the number of independent such
relations equals $q$.

10. How is formula (1.3.18) obtained?

11. If $\mathcal{A}$ is singular, prove that if $y$ is second element of a pair $[x, y]$, then $y$ is second
element of infinitely many such pairs.


1.4 MATRICES


Let us consider the set $\mathfrak{M}_n$ of all $n$ by $n$ matrices over the complex field. Since
$\mathfrak{M}_1 = C^1 = C$, we assume $n > 1$. Does the set $\mathfrak{M}_n$ have any discernible structure,
algebraic or geometrical? To study this question we consider simultaneously the
set $\mathfrak{E}(C^n)$ of all linear bounded transformations from $C^n$ to $C^n$. This set is in 1-1
correspondence with the set $\mathfrak{M}_n$ once a basis has been chosen for $C^n$.


Now the set $\mathfrak{E}(C^n)$ may be endowed with an algebraic structure in a natural
manner. We can define addition, scalar multiplication, and element multiplication
for linear transformations in $\mathfrak{E}(C^n)$ in a way which extends immediately to more
general cases.

Definition 1.4.1. For any vector $x \in C^n$, any transformations $A, B \in \mathfrak{E}(C^n)$ and
any complex number $\alpha$ set

$$(A + B)(x) = A(x) + B(x), \tag{1.4.1}$$
$$(\alpha A)(x) = \alpha A(x), \tag{1.4.2}$$
$$(AB)(x) = A[B(x)]. \tag{1.4.3}$$

Here the right hand sides are meaningful and serve to define the transformation
in the left member: $A + B$, $\alpha A$, $AB$, respectively. The meaning of the first two
transformations is obvious. In the third case we first let $B$ act on $x$ and obtain a
vector $y = B(x)$. We then let $A$ act on $y$. Since the domain of $A$ is all of $C^n$, we can
form $A(y) = z$. Now the passage from $x$ to $z$ is clearly a linear transformation from
$C^n$ to $C^n$, i.e. an element of $\mathfrak{E}(C^n)$. We define this transformation as the product of
$A$ and $B$ in this order. It is important to observe the order for normally
$A[B(x)] \ne B[A(x)]$ so that

$$AB \ne BA. \tag{1.4.4}$$

Thus we are dealing with non-commutative multiplication in $\mathfrak{E}(C^n)$. If, exceptionally,
$AB = BA$, these two transformations are said to commute. Note that $A$ commutes
with its own powers and that the law of exponents is valid

$$A^j A^k = A^{j+k}. \tag{1.4.5}$$

The successive powers are defined recursively by

$$A^{j+1}(x) = A[A^j(x)]. \tag{1.4.6}$$

We observed above that there is a 1-1 correspondence between linear transformations
$A$ and matrices $\mathcal{A}$ once a particular, say orthonormal, basis
$\{u_k;\ k = 1, 2, \ldots, n\}$ has been chosen. We can now define the algebraic operations
for matrices in such a way that the correspondence preserves the algebraic
operations, i.e. sums go into sums, scalar products into scalar products, and products
of elements into products of elements. Such a correspondence is known as an
isomorphism. What is required is that

$$A \leftrightarrow \mathcal{A}, \quad B \leftrightarrow \mathcal{B} \tag{1.4.7}$$

shall imply that

$$A + B \leftrightarrow \mathcal{A} + \mathcal{B}, \quad \alpha A \leftrightarrow \alpha\mathcal{A}, \quad AB \leftrightarrow \mathcal{A}\mathcal{B}. \tag{1.4.8}$$

This leads to the following definition of the algebraic operations on matrices:

$$\mathcal{A} + \mathcal{B} = (a_{jk} + b_{jk}), \tag{1.4.9}$$
$$\alpha\mathcal{A} = (\alpha a_{jk}), \tag{1.4.10}$$
$$\mathcal{A}\mathcal{B} = (c_{jk}), \quad c_{jk} = \sum_{m=1}^n a_{jm} b_{mk}. \tag{1.4.11}$$

Only the last convention needs justification. We have

$$y = \mathcal{B} \cdot x, \quad z = \mathcal{A} \cdot y$$

or, in terms of components (= coordinates),

$$y_m = \sum_{k=1}^n b_{mk} x_k,$$

$$z_j = \sum_{m=1}^n a_{jm} y_m = \sum_{m=1}^n a_{jm} \sum_{k=1}^n b_{mk} x_k = \sum_{k=1}^n \Big[\sum_{m=1}^n a_{jm} b_{mk}\Big] x_k,$$

so that $c_{jk}$ has the value stated in (1.4.11).
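The justification of (1.4.11) amounts to saying that applying $\mathcal{B}$ and then $\mathcal{A}$ to a vector gives the same result as applying the single matrix $\mathcal{A}\mathcal{B}$. A minimal sketch in Python follows; the helper names `mat_mul` and `mat_vec` and the sample matrices are hypothetical and serve only as illustration. The same data also show that $\mathcal{A}\mathcal{B}$ and $\mathcal{B}\mathcal{A}$ may differ, in line with (1.4.4).

```python
def mat_mul(a, b):
    """(AB)_{jk} = sum_m a[j][m] * b[m][k], formula (1.4.11)."""
    n = len(a)
    return [[sum(a[j][m] * b[m][k] for m in range(n)) for k in range(n)]
            for j in range(n)]


def mat_vec(a, x):
    """y_j = sum_k a[j][k] * x[k]."""
    return [sum(a_jk * x_k for a_jk, x_k in zip(row, x)) for row in a]


# Hypothetical sample data.
A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 0]]
x = [3, 5]

AB = mat_mul(A, B)
BA = mat_mul(B, A)

# The matrix of the composite x -> A[B(x)] is the product AB.
print(mat_vec(AB, x), mat_vec(A, mat_vec(B, x)))

# The order of the factors matters.
print(AB, BA)
```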

A set in which the operations of addition, multiplication by scalars, and 
multiplication of elements may be performed is known as an algebra provided the 
operations satisfy certain simple conditions which will be specified later (see Sections 
2.1 and 2.6). It is an algebra over the complex (real) field if the scalars are complex 
(real) numbers. It is a non-commutative algebra if element multiplication is non- 
commutative. 

The reader undoubtedly realizes that he has encountered a number of algebras
in his earlier study of mathematics. Thus functions of a real variable which are
continuous at a fixed point form an algebra. So do the differentiable functions.
Functions of a complex variable which are holomorphic in a given domain form an
algebra. All these function algebras are commutative in the sense that $fg = gf$ for
all elements $f$ and $g$ of the algebra. On the other hand, as shown in Exercise 1.1,
the vectors in $R^3$ form a non-commutative algebra under the operations of vector
addition, multiplication by real scalars, and the forming of cross products. Here
we have now two new examples of non-commutative algebras, namely, $\mathfrak{E}(C^n)$
and $\mathfrak{M}_n$.

The algebra $\mathfrak{E}(C^n)$ is associative in the sense that

$$(A + B) + C = A + (B + C), \tag{1.4.12}$$
$$(AB)C = A(BC). \tag{1.4.13}$$

It is also distributive, meaning that

$$(A + B)C = AC + BC, \quad A(B + C) = AB + AC. \tag{1.4.14}$$

The same properties hold for the corresponding elements of $\mathfrak{M}_n$. Verification is left
to the reader. These properties should not be treated as trivial and self-evident.
Thus, for example, the vector algebra in $R^3$ mentioned above is non-associative
since the relation

$$(x \times y) \times z \ne x \times (y \times z) \tag{1.4.15}$$


may very well hold. A case in point is given by $x = i$, $y = z = j$, where the left
member is $-i$ while the right is 0.

An algebra may have a unit element, i.e. an element which is neutral for
multiplication. If there is such an element, it is unique. Both $\mathfrak{E}(C^n)$ and $\mathfrak{M}_n$ have
unit elements. There is a linear transformation which leaves $C^n$ pointwise invariant,

$$y = x \quad \text{or} \quad y = I(x), \tag{1.4.16}$$

known as the identity mapping. We have clearly

$$IA = AI = A \tag{1.4.17}$$

for all $A$. There is a corresponding matrix $\mathcal{E}$, known as the unit matrix, such that

$$\mathcal{A}\mathcal{E} = \mathcal{E}\mathcal{A} = \mathcal{A} \tag{1.4.18}$$

for all $\mathcal{A}$. $\mathcal{E}$ has ones along the main diagonal, all other entries are zero, i.e.

$$(\mathcal{E})_{jk} = \delta_{jk}.$$

In an algebra there may be inverses and quotients, but it is to be expected that
the case where all elements have inverses, the zero element excepted, is of rare
occurrence. We say that $A$ has an inverse $B$ if there exists an element $B \in \mathfrak{E}(C^n)$
such that

$$AB = BA = I. \tag{1.4.19}$$

We then write $B = A^{-1}$. Note that $B$ is a linear transformation by definition since
the only inverses accepted at this stage must be members of $\mathfrak{E}(C^n)$. If $B$ exists, then
it is unique and, as will be shown later, $B$ is 1-1 and onto.

To the inverse transformation $B$ corresponds a matrix $\mathcal{B}$ which is the inverse
of the matrix $\mathcal{A}$ so that

$$\mathcal{A}\mathcal{B} = \mathcal{B}\mathcal{A} = \mathcal{E}. \tag{1.4.20}$$

We then write $\mathcal{B} = \mathcal{A}^{-1}$.


Lemma 1.4.1. If $B$ and $\mathcal{B}$ exist, they are unique.


Proof. It is enough to prove the case when $B \in \mathfrak{E}(C^n)$. The matrix case is handled
in the same manner. Suppose that $B$ exists as an element of $\mathfrak{E}(C^n)$ and satisfies
(1.4.19). Suppose there is a transformation $C \in \mathfrak{E}(C^n)$ such that

$$AC = CA = I.$$

We have then

$$B(AC) = (BA)C = IC = C = B(I) = B \quad \text{or} \quad C = B$$

as asserted. ∎


Lemma 1.4.2. If $B$ exists in $\mathfrak{E}(C^n)$, then $B$ is 1-1 and onto.


Proof. By assumption $B$ is linear. The domain of $B$ in $C^n$ is the range of
$A$: $D[B] = R[A]$. Now a necessary and sufficient condition for $A$ to have an
inverse is that $N[A] = \{0\}$. By Theorem 1.3.2 this implies and is implied by
$R[A] = C^n$ so that $D[B] = R[A] = C^n$. We must have $N[B] = \{0\}$, for if

$$B(z) = 0, \quad \text{then} \quad (AB)(z) = I(z) = z = 0.$$

This implies $R[B] = C^n$, so that $B$ is onto as well as 1-1. ∎

In both these lemmas the existence of the inverse is assumed. We have now to
face the question of finding a simple criterion for the existence. This is easiest to
handle for the matrix case.


Theorem 1.4.1. The matrix $\mathcal{A}$ has an inverse iff $\mathcal{A}$ is regular, i.e. $\det(a_{jk}) \ne 0$.


Proof. We recall from the elementary theory of determinants that the product of
two determinants may be written as a determinant, in fact in four different ways.
One of these expresses that

$$\det(\mathcal{A}) \det(\mathcal{B}) = \det(\mathcal{A}\mathcal{B}), \tag{1.4.21}$$

where $\mathcal{A} = (a_{jk})$, $\mathcal{B} = (b_{jk})$, $\mathcal{A}\mathcal{B} = (c_{jk}) = \big(\sum_{m=1}^n a_{jm} b_{mk}\big)$. If now $\mathcal{A}\mathcal{B} = \mathcal{E}$, the right
member of (1.4.21) equals 1 and

$$\det(a_{jk}) \det(b_{jk}) = 1, \tag{1.4.22}$$

so that $\det(a_{jk}) \ne 0$ is a necessary condition for the existence of an inverse matrix.
This means that $\mathcal{A}$ is regular. In passing let us notice that

$$\det(\mathcal{B}) = [\det(a_{jk})]^{-1}.$$

In particular, $\mathcal{B}$ is also regular.

On the other hand, if $\Delta = \det(a_{jk}) \ne 0$, the inverse matrix can be constructed
using elementary facts in the theory of determinants. We recall that a determinant
may be expanded according to the elements of any one of its rows or columns.
Thus for the $j$th row

$$\Delta = a_{j1}A_{j1} + a_{j2}A_{j2} + \cdots + a_{jn}A_{jn}, \tag{1.4.23}$$

where $A_{jk}$ is known as the cofactor of $a_{jk}$. On the other hand, multiplying the
elements of the $j$th row by the cofactors of the $p$th row with $p \ne j$ and adding gives 0
instead of $\Delta$, so that

$$0 = a_{j1}A_{p1} + a_{j2}A_{p2} + \cdots + a_{jn}A_{pn}. \tag{1.4.24}$$

Here $\Delta \ne 0$. Define a matrix $\mathcal{B} = (b_{jk})$ with

$$b_{jk} = A_{kj}\,\Delta^{-1} \tag{1.4.25}$$

and form the product $\mathcal{A}\mathcal{B}$. The element in the place $(j, k)$ is

$$(a_{j1}A_{k1} + a_{j2}A_{k2} + \cdots + a_{jn}A_{kn})\,\Delta^{-1}$$

and this equals $\delta_{jk}$. Hence $\mathcal{A}\mathcal{B} = \mathcal{E}$. Expanding by columns instead of by rows, we
obtain $\mathcal{B}\mathcal{A} = \mathcal{E}$. Thus if $\Delta = \det(a_{jk}) \ne 0$, then $\mathcal{A}^{-1}$ exists and is the matrix $\mathcal{B}$
defined by (1.4.25).

Let us see now how this result applies to the theory of linear transformations
from $C^n$ to $C^n$. If $\Delta = \det(a_{jk}) \ne 0$ so that $\mathcal{A}$ is regular, then the corresponding
transformation $A$, which is uniquely determined by $\mathcal{A}$ for the given basis, also has
an inverse $A^{-1}$. By Lemmas 1.4.1 and 1.4.2 the inverse is a uniquely determined
element of $\mathfrak{E}(C^n)$. Further, $A^{-1}$ is 1-1 and onto; if

$$y = \mathcal{A} \cdot x, \quad \forall x, \tag{1.4.26}$$

then

$$x = \mathcal{A}^{-1} \cdot y, \quad \forall y, \tag{1.4.27}$$

is the representation of the inverse transformation $A^{-1}$. We recall that $\forall$ is read
"all" or "for all." ∎
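The construction (1.4.25) of the inverse from cofactors can be carried out mechanically. The following Python sketch is an illustration only; the function names `minor`, `det`, `cofactor`, `inverse` and `mat_mul` and the sample matrix are hypothetical, and the recursive cofactor expansion is chosen for its closeness to (1.4.23), not for efficiency. Exact rational arithmetic is used so that the products $\mathcal{A}\mathcal{B}$ and $\mathcal{B}\mathcal{A}$ can be compared with the unit matrix without rounding.

```python
from fractions import Fraction


def minor(a, j, k):
    """Matrix obtained from a by deleting row j and column k."""
    return [row[:k] + row[k + 1:] for i, row in enumerate(a) if i != j]


def det(a):
    """Determinant by cofactor expansion along the first row, cf. (1.4.23)."""
    if not a:
        return 1  # determinant of the empty matrix
    return sum((-1) ** k * a[0][k] * det(minor(a, 0, k)) for k in range(len(a)))


def cofactor(a, j, k):
    """Cofactor A_{jk} of the entry a_{jk}."""
    return (-1) ** (j + k) * det(minor(a, j, k))


def inverse(a):
    """b_{jk} = A_{kj} / Delta, formula (1.4.25); requires Delta != 0."""
    delta = det(a)
    if delta == 0:
        raise ValueError("singular matrix: determinant is zero")
    n = len(a)
    return [[Fraction(cofactor(a, k, j)) / delta for k in range(n)]
            for j in range(n)]


def mat_mul(a, b):
    """(AB)_{jk} = sum_m a[j][m] * b[m][k], formula (1.4.11)."""
    n = len(a)
    return [[sum(a[j][m] * b[m][k] for m in range(n)) for k in range(n)]
            for j in range(n)]


# Hypothetical regular matrix with determinant 8.
A = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
B = inverse(A)
E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(det(A))
print(mat_mul(A, B) == E, mat_mul(B, A) == E)
```

Note the transposition of indices in `inverse`: the entry in the place $(j, k)$ uses the cofactor $A_{kj}$, exactly as in (1.4.25).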


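Let us return to (1.4.26) and drop the assumption that $\mathcal{A}$ is regular. Let $\mathcal{B}$ be
any regular matrix in $\mathfrak{M}_n$. Then

$$\mathcal{B} \cdot y = \mathcal{B}\mathcal{A} \cdot x = (\mathcal{B}\mathcal{A}\mathcal{B}^{-1})(\mathcal{B} \cdot x). \tag{1.4.28}$$

This holds for all $x$ and $y$ satisfying (1.4.26). The matrices $\mathcal{A}$ and

$$\mathcal{A}_1 = \mathcal{B}\mathcal{A}\mathcal{B}^{-1} \tag{1.4.29}$$

are said to be similar. The relation is symmetric for

$$\mathcal{A} = \mathcal{B}^{-1}\mathcal{A}_1\mathcal{B}. \tag{1.4.30}$$

Corresponding to the matrix $\mathcal{A}_1$ we have an element $A_1$ of $\mathfrak{E}(C^n)$ and

$$A_1 = BAB^{-1}, \quad A = B^{-1}A_1B. \tag{1.4.31}$$

Here also we speak of similar transformations $A$ and $A_1$.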


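Theorem 1.4.2. If $\mathcal{A}$ and $\mathcal{A}_1$ are similar $n$ by $n$ matrices, then there exist two
ordered bases of $C^n$, say $F = \{u_j;\ j = 1, 2, \ldots, n\}$ and $F_1 = \{v_j;\ j = 1, 2, \ldots, n\}$ so
that the action of $\mathcal{A}_1$ on $F_1$ is described by the same transformation as the action
of $\mathcal{A}$ on $F$. In other words, if

$$\mathcal{A} \cdot u_j = \sum_{k=1}^n a_{jk} u_k, \quad j = 1, 2, \ldots, n, \tag{1.4.32}$$

then

$$\mathcal{A}_1 \cdot v_j = \sum_{k=1}^n a_{jk} v_k, \quad j = 1, 2, \ldots, n. \tag{1.4.33}$$


Proof. We may assume that $F$ is given and that the action of $\mathcal{A}$ on $F$ is defined by
(1.4.32) where $\mathcal{A} = (a_{jk})$. Since $\mathcal{A}$ and $\mathcal{A}_1$ are similar, there exists a regular matrix $\mathcal{B}$
such that (1.4.29) holds. Define

$$v_j = \mathcal{B} \cdot u_j, \quad j = 1, 2, \ldots, n, \tag{1.4.34}$$

and set $F_1 = \{v_j;\ j = 1, 2, \ldots, n\}$. This is an ordered basis for $C^n$ since any relation

$$\sum_{j=1}^n c_j v_j = 0 \tag{1.4.35}$$

implies

$$\mathcal{B} \cdot \sum_{j=1}^n c_j u_j = 0 \quad \text{and hence} \quad \sum_{j=1}^n c_j u_j = 0,$$

for $\mathcal{B}$ is regular and hence $N[\mathcal{B}] = \{0\}$. Now (1.4.35) holds iff $c_j = 0$, $\forall j$.

Now consider

$$\mathcal{A}_1 \cdot v_j = [\mathcal{B}\mathcal{A}\mathcal{B}^{-1}] \cdot v_j = [\mathcal{B}\mathcal{A}][\mathcal{B}^{-1} \cdot v_j] = [\mathcal{B}\mathcal{A}] \cdot u_j = \mathcal{B} \cdot \sum_{k=1}^n a_{jk} u_k = \sum_{k=1}^n a_{jk}\, \mathcal{B} \cdot u_k = \sum_{k=1}^n a_{jk} v_k.$$

Thus to any given ordered basis $F$ of $C^n$ and any pair of similar matrices $\mathcal{A}$ and $\mathcal{A}_1$
we have constructed an ordered basis $F_1$ such that the action of $\mathcal{A}_1$ on $F_1$ is the same
as that of $\mathcal{A}$ on $F$. ∎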
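The content of the theorem can be checked on a small example. The sketch below is illustrative only and makes several assumptions not in the text: the matrices are 2 by 2 with hypothetical integer entries, $F$ is taken to be the standard basis, and the helper names `mat_mul`, `mat_vec` and `inverse_2x2` are invented here. It forms $\mathcal{A}_1 = \mathcal{B}\mathcal{A}\mathcal{B}^{-1}$, sets $v_j = \mathcal{B} \cdot u_j$, and compares the coordinates of $\mathcal{A} \cdot u_j$ relative to $F$ with the coordinates of $\mathcal{A}_1 \cdot v_j$ relative to $F_1$.

```python
from fractions import Fraction


def mat_mul(a, b):
    """(AB)_{jk} = sum_m a[j][m] * b[m][k]."""
    n = len(a)
    return [[sum(a[j][m] * b[m][k] for m in range(n)) for k in range(n)]
            for j in range(n)]


def mat_vec(a, x):
    """y_j = sum_k a[j][k] * x[k]."""
    return [sum(a_jk * x_k for a_jk, x_k in zip(row, x)) for row in a]


def inverse_2x2(m):
    """Inverse of a regular 2 by 2 matrix by the cofactor formula (1.4.25)."""
    (a, b), (c, d) = m
    delta = Fraction(a * d - b * c)
    if delta == 0:
        raise ValueError("singular matrix")
    return [[d / delta, -b / delta], [-c / delta, a / delta]]


A = [[2, 1], [0, 3]]
B = [[1, 1], [1, 2]]                  # regular: determinant 1
B_inv = inverse_2x2(B)
A1 = mat_mul(mat_mul(B, A), B_inv)    # (1.4.29)

F = [[1, 0], [0, 1]]                  # u_1, u_2 (standard basis, an assumption)
F1 = [mat_vec(B, u) for u in F]       # v_j = B . u_j, (1.4.34)

for u, v in zip(F, F1):
    # Coordinates of A . u_j relative to F (F is the standard basis).
    coords_in_F = mat_vec(A, u)
    # Coordinates c of A1 . v_j relative to F1 solve B . c = A1 . v_j.
    coords_in_F1 = mat_vec(B_inv, mat_vec(A1, v))
    print(coords_in_F, coords_in_F1)
```

In each case the two coordinate vectors agree, which is what (1.4.32) and (1.4.33) assert.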


We have, of course, similar relations between bases and similar transformations
$A$ and $A_1$.

Let us now consider some operations that may be performed on matrices.
To a given $n$ by $n$ matrix $\mathcal{A} = (a_{jk})$ we associate two other matrices $\mathcal{A}^t$ and $\mathcal{A}^*$ known
as the transpose and the conjugate transpose of $\mathcal{A}$. Here

$$(\mathcal{A}^t)_{jk} = a_{kj}, \quad (\mathcal{A}^*)_{jk} = \bar{a}_{kj}, \tag{1.4.36}$$

where the bar as usual indicates the complex conjugate. If the $a_{jk}$ are real, $\mathcal{A}^t = \mathcal{A}^*$.
We say that $\mathcal{A}$ is symmetric, if

$$\mathcal{A}^t = \mathcal{A}, \tag{1.4.37}$$

Hermitian if

$$\mathcal{A}^* = \mathcal{A}. \tag{1.4.38}$$

The latter case is named after the famous French mathematician Charles Hermite
(1822-1901).

The Gramian defined above is an example of a Hermitian matrix. As we shall
see later (Section 1.7), Hermitian matrices have many important properties. The
transformation

$$\mathcal{A} \to \mathcal{A}^* \tag{1.4.39}$$

is called a conjugation operator. It is an involution in the sense that

$$(\mathcal{A}^*)^* = \mathcal{A}. \tag{1.4.40}$$
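As a small illustration of (1.4.36) to (1.4.40), the Python sketch below forms the conjugate transpose, checks the involution property, and verifies that a matrix of inner products is Hermitian. It is not part of the original text: the helper names, the sample vectors, and the convention that the inner product $(x, y) = \sum_j x_j \bar{y}_j$ is conjugate-linear in the second argument are assumptions made here.

```python
def conjugate_transpose(a):
    """(A*)_{jk} = conjugate of a_{kj}, formula (1.4.36)."""
    n = len(a)
    return [[complex(a[k][j]).conjugate() for k in range(n)] for j in range(n)]


def inner(x, y):
    """Inner product sum_j x_j * conj(y_j) (assumed convention)."""
    return sum(xj * complex(yj).conjugate() for xj, yj in zip(x, y))


# Hypothetical matrix: the involution (A*)* = A of (1.4.40).
A = [[1, 2j], [3, 4 - 1j]]
print(conjugate_transpose(conjugate_transpose(A)) == A)

# Matrix of inner products g_{jk} = (v_j, v_k) of two hypothetical vectors.
vectors = [[1, 1j], [2, 1 - 1j]]
gram = [[inner(vj, vk) for vk in vectors] for vj in vectors]
print(gram)
print(conjugate_transpose(gram) == gram)   # Hermitian, (1.4.38)
```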


Let us consider some special singular matrices. A matrix $\mathcal{A}$ such that

$$\mathcal{A}^2 = \mathcal{A} \tag{1.4.41}$$

is called an idempotent. This equation is clearly satisfied by the unit matrix and
by the zero matrix which are the trivial idempotents. In $\mathfrak{M}_2$

$$\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}$$

are examples of non-trivial idempotents. All idempotents, the unit matrix excepted,
are singular. To an idempotent matrix corresponds an idempotent transformation,
often called a projection.

The product of two matrices may very well be the zero matrix without either
factor being zero. Since

$$\begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$

this phenomenon can occur already in $\mathfrak{M}_2$. Such a matrix is called a divisor of zero.
It is clearly singular.

Another type of a singularity is shown by

$$\mathcal{A} = \begin{pmatrix} 0 & 1 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}.$$

Here $\mathcal{A}^3 = 0$. This is a nilpotent matrix. The lowest exponent $k$ such that $\mathcal{A}^k = 0$
is the degree of nilpotency of $\mathcal{A}$.
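The three kinds of singular matrices just described are easy to test by direct multiplication. The following Python sketch is illustrative only; the helper names `mat_mul` and `nilpotency_degree` are hypothetical, and the search for the degree of nilpotency stops at the exponent $n$, which assumes the bound stated in Problem 16 of Exercise 1.4.

```python
def mat_mul(a, b):
    """(AB)_{jk} = sum_m a[j][m] * b[m][k]."""
    n = len(a)
    return [[sum(a[j][m] * b[m][k] for m in range(n)) for k in range(n)]
            for j in range(n)]


def nilpotency_degree(a):
    """Lowest k with a^k = 0, searching k = 1, ..., n; None if there is none."""
    n = len(a)
    zero = [[0] * n for _ in range(n)]
    power = a
    for k in range(1, n + 1):
        if power == zero:
            return k
        power = mat_mul(power, a)
    return None


# Idempotents: each matrix equals its own square, (1.4.41).
idempotents = [[[1, 0], [0, 0]], [[0, 0], [0, 1]], [[1, 1], [0, 0]]]
print([mat_mul(p, p) == p for p in idempotents])

# Divisors of zero: neither factor is zero, yet the product is.
left = [[0, 0], [1, 0]]
right = [[0, 0], [0, 1]]
print(mat_mul(left, right))

# Nilpotent matrix of degree 3.
N = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
print(nilpotency_degree(N))
```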

At the beginning of this section the question was raised if the sets $\mathfrak{E}(C^n)$ and $\mathfrak{M}_n$
have some structure, algebraic or geometric. We have endowed both spaces with
algebraic structures. Can we also provide them with geometric structures,
preferably metric ones? The answer is in the affirmative for both spaces. We
postpone the discussion of $\mathfrak{E}(C^n)$, but observe that there are several ways of
defining a norm and a distance in $\mathfrak{M}_n$.

We may regard $\mathfrak{M}_n$ as a linear vector space of $n^2$ dimensions over the complex
field with addition and scalar multiplication defined by (1.4.9) and (1.4.10). In this
context it is natural to endow $\mathfrak{M}_n$ with a Euclidean metric by setting

$$\|\mathcal{A}\|_2 = \Big[\sum_{j=1}^n \sum_{k=1}^n |a_{jk}|^2\Big]^{1/2}. \tag{1.4.42}$$

This is known as the Frobenius-Wedderburn norm [G. Frobenius (1849-1917) and
J. H. Maclagan Wedderburn (1882-1948)] and is commonly used in matrix theory.
It has the disadvantage of assigning the norm $\sqrt{n}$ to the unit matrix. Since this is
undesirable, we shall use instead

$$\|\mathcal{A}\|_1 = \max_j \sum_{k=1}^n |a_{jk}|, \tag{1.4.43}$$

which gives $\|\mathcal{E}\|_1 = 1$. This notion of length will be used below and we define the
distance between two matrices in $\mathfrak{M}_n$ as

$$\|\mathcal{A} - \mathcal{B}\|_1 = \|\mathcal{A} + (-1)\mathcal{B}\|_1. \tag{1.4.44}$$


A third alternative is based on the definition of the norm in $\mathfrak{E}(C^n)$ which will be
given later. See (1.6.23). This amounts to setting $\|\mathcal{A}\| = \|A\|_0$.


Lemma 1.4.3. The norm of the product of two matrices is at most equal to the
product of the norms of the factors.


Remark. This holds for all three norms proposed. We shall give the proof for the
norm defined by (1.4.43).

Proof. We have

$$\|\mathcal{A}\mathcal{B}\|_1 = \max_j \sum_{k=1}^n \Big|\sum_{m=1}^n a_{jm} b_{mk}\Big| \le \max_j \sum_{m=1}^n |a_{jm}| \sum_{k=1}^n |b_{mk}| \le \max_j \sum_{m=1}^n |a_{jm}|\, \|\mathcal{B}\|_1 = \|\mathcal{A}\|_1 \|\mathcal{B}\|_1$$

as asserted. ∎
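The two norms (1.4.42) and (1.4.43) and the inequality of Lemma 1.4.3 can be tried out numerically. The Python sketch below is an illustration, not a proof; the function names and the sample matrices with complex entries are hypothetical.

```python
import math


def frobenius_norm(a):
    """Euclidean norm (1.4.42): square root of the sum of |a_jk|^2."""
    return math.sqrt(sum(abs(x) ** 2 for row in a for x in row))


def row_sum_norm(a):
    """Norm (1.4.43): largest row sum of absolute values."""
    return max(sum(abs(x) for x in row) for row in a)


def mat_mul(a, b):
    """(AB)_{jk} = sum_m a[j][m] * b[m][k]."""
    n = len(a)
    return [[sum(a[j][m] * b[m][k] for m in range(n)) for k in range(n)]
            for j in range(n)]


# The unit matrix of order 3 has Euclidean norm sqrt(3) but row-sum norm 1.
E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(frobenius_norm(E), row_sum_norm(E))

# Hypothetical matrices with complex entries.
A = [[1, -2j], [3, 4]]
B = [[0, 1], [1j, -1]]
AB = mat_mul(A, B)

for norm in (row_sum_norm, frobenius_norm):
    # Lemma 1.4.3: norm of the product is at most the product of the norms.
    print(norm.__name__, norm(AB) <= norm(A) * norm(B))
```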


The existence of alternate definitions of the distance between matrices raises
the question of equivalence of the definitions in the case of convergence. Suppose
we have an infinite set of matrices

$$\mathcal{A}_m = (a_{jk}^{(m)}), \quad m = 0, 1, 2, \ldots.$$

What should be meant by the statement

$$\lim_{m\to\infty} \mathcal{A}_m = \mathcal{A}_0? \tag{1.4.45}$$

We encountered a similar problem for vectors in $C^n$ (see Problem 9, Exercise 1.2).
There we introduced convergence in the sense of the metric of the space but found
that this type of convergence is equivalent to convergence for each coordinate
separately. The same situation holds here. We have then at least four possibilities
of defining convergence for matrices. One would be

$$\lim_{m\to\infty} a_{jk}^{(m)} = a_{jk}^{(0)}, \quad j, k = 1, 2, 3, \ldots, n, \tag{1.4.46}$$

the others are of the type

$$\lim_{m\to\infty} \|\mathcal{A}_m - \mathcal{A}_0\| = 0, \tag{1.4.47}$$

where for the norm we can take that defined by (1.4.42) or (1.4.43) or (1.6.20).
Fortunately all four possibilities are equivalent.


Lemma 1.4.4. Formula (1.4.47) is valid for one of the stated norms iff (1.4.46)
holds and then (1.4.47) holds also for the other norms.


Proof. Using the symbol $\Rightarrow$ to denote implication, we shall prove (1.4.46) $\Rightarrow$
(1.4.47) $\Rightarrow$ (1.4.46).

I. Suppose that (1.4.46) holds and consider the norm defined by (1.4.43).
Then, as $m \to \infty$,

$$\|\mathcal{A}_m - \mathcal{A}_0\|_1 = \max_j \sum_{k=1}^n |a_{jk}^{(m)} - a_{jk}^{(0)}| \to 0. \tag{1.4.48}$$

Similarly,

$$\|\mathcal{A}_m - \mathcal{A}_0\|_2 = \Big[\sum_{j=1}^n \sum_{k=1}^n |a_{jk}^{(m)} - a_{jk}^{(0)}|^2\Big]^{1/2} \to 0.$$

We omit the operator norm, which has not been defined at this stage.

II. Suppose that $\|\mathcal{A}_m - \mathcal{A}_0\|_1 \to 0$. Then (1.4.48) shows that this implies
(1.4.46). ∎


EXERCISE 1.4

1. $C^n$ is a linear vector space over the complex numbers and vectors may be added and
multiplied by complex numbers. Make a list of the properties of addition and scalar
multiplication which you have encountered so far. In particular, what types of
associativity, commutativity, and distributivity have you noticed?

2. Make a similar list of properties of an algebra using $\mathfrak{E}(C^n)$ as a concrete illustration.

3. Verify (1.4.12) and (1.4.13).

4. Verify (1.4.14).

5. Verify (1.4.15).

6. Find two operators $A$ and $B$ in $\mathfrak{E}(C^2)$ which do not commute.


7. Show that a polynomial in a matrix $\mathcal{A}$, say

$$P(\mathcal{A}) = a_0\mathcal{E} + \sum_{j=1}^p a_j \mathcal{A}^j, \quad a_j \in C,$$

commutes with any other polynomial in $\mathcal{A}$.


8. A finite set of matrices $\{\mathcal{A}_j;\ j = 1, 2, \ldots, p;\ \mathcal{A}_j \in \mathfrak{M}_n\}$ is linearly independent over
the complex field if

$$\sum_{j=1}^p c_j \mathcal{A}_j = 0$$

implies $c_1 = c_2 = \cdots = c_p = 0$. Show that $p$ cannot exceed $n^2$.

9. Show that this implies that the matrix $\mathcal{A}$ satisfies an algebraic equation with
coefficients in $C$.


10. Write down the equation satisfied by (1) an idempotent, (2) a nilpotent. Note that
both equations are without constant term.

11. If $\mathcal{A}$ and $\mathcal{B}$ are regular, show that so is $\mathcal{A}\mathcal{B}$ and find its inverse. Conversely, if
$\mathcal{C} = \mathcal{A}\mathcal{B}$ is regular, show that $\mathcal{A}$ and $\mathcal{B}$ must be regular.

12. Use this to show that an algebraic equation satisfied by a singular matrix cannot
involve a constant term.

13. If the equation $P(x) = 0$ with $P(0) \ne 0$ is satisfied by a regular matrix, find the
equation satisfied by its inverse.


14. Let $\mathcal{E}_{jk}$ be the matrix with a one in the place $(j, k)$ and a zero everywhere else. Show
that the set $\{\mathcal{E}_{jk};\ j, k = 1, 2, \ldots, n\}$ is a basis for $\mathfrak{M}_n$ regarded as a linear vector space
of dimension $n^2$.


15. Take $n = 3$ and let $\mathcal{E}_{jk}$ be defined as in the preceding problem. Find all matrices in
$\mathfrak{M}_3$ which commute with δ...

16. Prove that the degree of nilpotency of a matrix in $\mathfrak{M}_n$ cannot exceed $n$. Can the
degree $n$ be reached?

17. Show that the unit elements of $\mathfrak{E}(C^n)$ and $\mathfrak{M}_n$ are unique.

18. If $x$ and $y$ are vectors in $C^n$, find a matrix $\mathcal{A}$ in $\mathfrak{M}_n$ such that $y = \mathcal{A} \cdot x$. Is the
solution unique? Does the problem always have a solution?

19. A diagonal matrix is one in which all elements off the main diagonal are zero, i.e.
$j \ne k$ implies $a_{jk} = 0$. When is such a matrix singular? Show that the set of all
diagonal matrices in $\mathfrak{M}_n$ forms a commutative subalgebra.

20. Characterize the idempotents of this subalgebra and determine the divisors of zero.
Are there any non-trivial nilpotents?

21. Prove that an idempotent in $\mathfrak{M}_n$ which is regular must coincide with the unit matrix
and hence all other idempotents are singular.

22. If $\mathcal{A}$ and $\mathcal{A}_1$ are similar matrices show that $\det(\mathcal{A})$ and $\det(\mathcal{A}_1)$ have the same
numerical value. Is the converse true?

The linear transformation $A \in \mathfrak{E}(C^n)$ whose representation in terms of a chosen
basis $\{u_j\}$ is the matrix $\mathcal{A}$ maps $C^n$ into the linear subspace $R[A]$ of $C^n$ made up of
all vectors of the form

$$y = \mathcal{A} \cdot x. \tag{1.5.1}$$


The zero element is left invariant by 4. Are there other invariant elements? As a 
rule not, but if we relax the demands and require only that 


AX = ax (1.5.2) 


for some constant « and some vector x with [[Χ] =1, then a more favorable 
situation arises. Here (1.5.2) may be written 


(a5 — #)-x = 0. (1.5.3) 
Equivalently we require («J — A)x =0. Thus we find that the matrix «& — σἔ is 


required to annihilate a non-zero vector x and this can happen iff the matrix is 
singular. Now if αδ — # is singular, then its determinant is zero, 


det (a6 — A) = 0. (1.5.4) 
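Consider the equation in $\lambda$

$$\Delta(\lambda) \equiv \det(\lambda\mathcal{E} - \mathcal{A}) = 0, \tag{1.5.5}$$

or, in more detail,

$$\Delta(\lambda) \equiv \begin{vmatrix} \lambda - a_{11} & -a_{12} & \cdots & -a_{1n} \\ -a_{21} & \lambda - a_{22} & \cdots & -a_{2n} \\ \vdots & \vdots & & \vdots \\ -a_{n1} & -a_{n2} & \cdots & \lambda - a_{nn} \end{vmatrix} = 0. \tag{1.5.6}$$

Expansion gives an algebraic equation in $\lambda$ of degree $n$ which has a total number of
$n$ roots in the complex field. Suppose that there are $p$ distinct roots $\lambda_1, \lambda_2, \ldots, \lambda_p$
and that the multiplicity of $\lambda_j$ is $k_j$. Here

$$p \le n, \qquad \sum_{j=1}^{p} k_j = n, \tag{1.5.7}$$

and we have

$$\Delta(\lambda) = \prod_{j=1}^{p} (\lambda - \lambda_j)^{k_j}. \tag{1.5.8}$$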


It follows that the number $\alpha$ in (1.5.2) must be one of the roots $\lambda_j$. Here (1.5.6) is
known as the characteristic equation of the matrix $\mathcal{A}$. Incidentally, as will be shown
later, it is also satisfied by $\mathcal{A}$ itself, i.e. $\Delta(\mathcal{A}) = \mathcal{O}$. Note that we are concerned with
matrices and vectors over the complex field. Over the real field and for even $n$,
(1.5.6) may have no roots at all.

The roots are known under many names: characteristic roots or values,
eigenvalues, and latent roots are the most common. The set of roots for which (1.5.3)
has a non-trivial solution is called the spectrum of $\mathcal{A}$ and will be denoted by $\sigma(\mathcal{A})$
in the following. If operators are held in the foreground, then $\sigma(A)$ will be the
corresponding notation.

If $\lambda$ has a value distinct from all the roots, then $\lambda\mathcal{E} - \mathcal{A}$ is regular, i.e. has an
inverse and we set

$$(\lambda\mathcal{E} - \mathcal{A})^{-1} = \mathcal{R}(\lambda, \mathcal{A}), \tag{1.5.9}$$

known as the resolvent of $\mathcal{A}$. We know how to form the inverse of a regular matrix,
see formula (1.4.25). Applied to the present case this gives as the element of the
resolvent in the place $(j, k)$

$$\frac{A_{kj}(\lambda)}{\Delta(\lambda)} \equiv r_{jk}(\lambda), \tag{1.5.10}$$

where $A_{kj}(\lambda)$ is the cofactor of $\lambda\delta_{kj} - a_{kj}$ in the determinant $\Delta(\lambda)$. Thus $A_{kj}(\lambda)$ is a
polynomial in $\lambda$ with complex coefficients and of degree $\le n - 1$. It follows that
$r_{jk}(\lambda)$ is a rational function of $\lambda$ with poles included in the set of zeros of $\Delta(\lambda)$ and
$r_{jk}(\lambda) \to 0$ as $\lambda \to \infty$. Hence
$$\mathcal{R}(\lambda, \mathcal{A}) = \frac{\mathcal{S}(\lambda, \mathcal{A})}{\Delta(\lambda)}, \tag{1.5.11}$$

where $\mathcal{S}(\lambda, \mathcal{A})$ is a polynomial in $\lambda$ of degree $\le n - 1$ whose coefficients are constant
matrices in $\mathfrak{M}_n$.

It was stated above that the poles of $r_{jk}(\lambda)$ are included in the set of zeros of
$\Delta(\lambda)$. That no stronger statement can be made is illustrated by the case of a
diagonal matrix $\mathcal{A}$ with $a_{jj} = a_j$, $a_{jk} = 0$ if $j \ne k$. If the $a_j$ are distinct, then

$$r_{jj}(\lambda) = (\lambda - a_j)^{-1}, \qquad r_{jk}(\lambda) = 0, \quad j \ne k. \tag{1.5.12}$$

Here each diagonal element has a single simple pole and the non-diagonal elements
have none.

Formula (1.5.11) shows that $\mathcal{R}(\lambda, \mathcal{A})$ is a rational function of $\lambda$ with matrix
coefficients. Note that the matrices enter only in the numerator of this rational
function. The denominator is a scalar polynomial. Since each $r_{jk}(\lambda) \to 0$ as
$\lambda \to \infty$, $\mathcal{R}(\lambda, \mathcal{A})$ has the same property. Here we can apply classical results on scalar
rational functions to show that each $r_{jk}(\lambda)$ has an expansion in partial fractions of
the form

$$r_{jk}(\lambda) = \sum_{l} \sum_{m} \alpha_{jklm} (\lambda - \lambda_m)^{-l}. \tag{1.5.13}$$

Here the $\alpha_{jklm}$ are complex numbers, the $\lambda_m$'s run through the $p$ distinct roots of
$\Delta(\lambda)$ and $l$ is a positive integer taking on values from 1 to $k_m$, the multiplicity of the
root $\lambda_m$.

Passing from elements to matrices we obtain an expansion

$$\mathcal{R}(\lambda, \mathcal{A}) = \sum_{l} \sum_{m} (\lambda - \lambda_m)^{-l}\, \mathcal{C}_{lm}, \tag{1.5.14}$$

where now the $\mathcal{C}_{lm}$ are constant matrices in $\mathfrak{M}_n$. These matrices have very special
properties. They commute with each other and with $\mathcal{A}$. They are all singular
matrices, divisors of zero, idempotent for $l = 1$, nilpotent for $l > 1$. Moreover,
matrices $\mathcal{C}_{lm}$ with different second subscript are orthogonal to each other, i.e. their
product is the zero matrix.
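
As a small illustration of (1.5.10) and (1.5.11), the following sketch forms the resolvent of a 2-by-2 matrix as the adjugate of $\lambda\mathcal{E} - \mathcal{A}$ divided by $\Delta(\lambda)$ and checks that it inverts $\lambda\mathcal{E} - \mathcal{A}$. The function names `resolvent_2x2` and `matmul` and the sample matrix are illustrative assumptions, not notation from the text; exact rational arithmetic is used so that the check is not blurred by rounding.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def resolvent_2x2(a, lam):
    """Return adj(lam*E - A) / Delta(lam) for a 2x2 matrix of Fractions."""
    (a11, a12), (a21, a22) = a
    # Characteristic polynomial Delta(lam) = det(lam*E - A).
    delta = (lam - a11) * (lam - a22) - a12 * a21
    if delta == 0:
        raise ValueError("lam is a characteristic root; no resolvent exists")
    # Adjugate of lam*E - A: swap the diagonal entries, negate the others.
    return [[(lam - a22) / delta, a12 / delta],
            [a21 / delta, (lam - a11) / delta]]


A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(3)]]
lam = Fraction(5)
R = resolvent_2x2(A, lam)
lamE_minus_A = [[lam - A[0][0], -A[0][1]], [-A[1][0], lam - A[1][1]]]
product = matmul(lamE_minus_A, R)  # should be the unit matrix
```

Each entry of `R` is a polynomial of degree at most 1 in `lam` divided by the quadratic `delta`, in line with the degree bound $n - 1$ stated for $\mathcal{S}(\lambda, \mathcal{A})$.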

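It is possible to prove these properties by fairly simple means. Let us take the
expansion of $\mathcal{R}(\lambda, \mathcal{A})$ in the neighborhood of one of the zeros of $\Delta(\lambda)$, say $\lambda = \alpha$,
to simplify the notation. We have then

$$\mathcal{R}(\lambda, \mathcal{A}) = \frac{\mathcal{C}_s}{(\lambda - \alpha)^s} + \frac{\mathcal{C}_{s-1}}{(\lambda - \alpha)^{s-1}} + \cdots + \frac{\mathcal{C}_1}{\lambda - \alpha} + \mathcal{B}(\lambda). \tag{1.5.15}$$

Here $s$ is at most equal to the multiplicity of the root $\alpha$ and $\mathcal{B}(\lambda)$ is a power series

$$\mathcal{B}(\lambda) = \sum_{k=0}^{\infty} \mathcal{B}_k (\lambda - \alpha)^k, \tag{1.5.16}$$

convergent in norm for $|\lambda - \alpha| < \rho$, the distance from $\alpha$ to the rest of the
spectrum. The $\mathcal{B}$'s and $\mathcal{C}$'s are constant matrices, elements of $\mathfrak{M}_n$. To see that the
remainder $\mathcal{B}(\lambda)$ in (1.5.15) really has a representation of type (1.5.16), we revert to
(1.5.13). Here we picked out the terms where $\lambda_m = \alpha$. The rest should be $\mathcal{B}(\lambda)$, i.e.

$$\mathcal{B}(\lambda) = \sum_{l} {\sum_{m}}' (\lambda - \lambda_m)^{-l}\, \mathcal{C}_{lm}, \tag{1.5.17}$$

where the prime after the summation sign indicates that terms with $\lambda_m = \alpha$ are to be
omitted. Now if $\lambda_m = \beta \ne \alpha$

$$(\lambda - \lambda_m)^{-l} = [\alpha - \beta + (\lambda - \alpha)]^{-l} = (\alpha - \beta)^{-l}\left[1 + \frac{\lambda - \alpha}{\alpha - \beta}\right]^{-l} = \sum_{k=0}^{\infty} \binom{-l}{k} (\alpha - \beta)^{-l-k} (\lambda - \alpha)^k,$$

a binomial series in $\lambda - \alpha$ convergent for $|\lambda - \alpha| < |\beta - \alpha|$ which is at least $\rho$.
We repeat this for each $\beta \ne \alpha$ in the spectrum and add the results. It follows that
for $|\lambda - \alpha| < \rho$ the rational function $\mathcal{B}(\lambda)$ is the sum of a finite number of matrices
$\mathcal{C}_{lm}$ each multiplied by an absolutely convergent binomial series. Collecting terms,
we obtain a series of type (1.5.16).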

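Thus the representation (1.5.15) with $\mathcal{B}(\lambda)$ given by (1.5.16) is valid in the
punctured disk $0 < |\lambda - \alpha| < \rho$ and in this domain

$$(\lambda\mathcal{E} - \mathcal{A})\, \mathcal{R}(\lambda, \mathcal{A}) = \mathcal{R}(\lambda, \mathcal{A})\, (\lambda\mathcal{E} - \mathcal{A}) = \mathcal{E}. \tag{1.5.18}$$

Here we substitute

$$(\lambda - \alpha)\mathcal{E} + (\alpha\mathcal{E} - \mathcal{A})$$

for $\lambda\mathcal{E} - \mathcal{A}$ and the expansion (1.5.15) for $\mathcal{R}(\lambda, \mathcal{A})$. We multiply out, collect terms,
and note that the result should be identically $\mathcal{E}$. This gives the conditions that the
coefficient matrices must satisfy. To force the negative powers of $\lambda - \alpha$ to drop out
we must have

$$\mathcal{C}_s (\mathcal{A} - \alpha\mathcal{E}) = \mathcal{O}, \qquad \mathcal{C}_j = \mathcal{C}_{j-1} (\mathcal{A} - \alpha\mathcal{E}), \tag{1.5.19}$$

as well as the relations obtained by permuting the factors in the products. The
constant term gives

$$\mathcal{C}_1 = \mathcal{B}_0 (\mathcal{A} - \alpha\mathcal{E}) + \mathcal{E} \tag{1.5.20}$$

and the positive powers will drop out provided for each $k > 0$

$$\mathcal{B}_{k-1} = \mathcal{B}_k (\mathcal{A} - \alpha\mathcal{E}). \tag{1.5.21}$$

Again the factors commute.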
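The next step is to examine $\mathcal{B}(\lambda)$. The last two relations imply that

$$(\lambda\mathcal{E} - \mathcal{A})\, \mathcal{B}(\lambda) = \mathcal{B}(\lambda)\, (\lambda\mathcal{E} - \mathcal{A}) = -\mathcal{B}_0(\mathcal{A} - \alpha\mathcal{E}) \equiv \mathcal{J}_0. \tag{1.5.22}$$

Hence, for $\lambda$ not in the spectrum of $\mathcal{A}$,

$$\mathcal{B}(\lambda) = \mathcal{J}_0\, \mathcal{R}(\lambda, \mathcal{A}) = \mathcal{R}(\lambda, \mathcal{A})\, \mathcal{J}_0. \tag{1.5.23}$$

Here we compare the expansions for $\mathcal{B}(\lambda)$ and for $\mathcal{R}(\lambda, \mathcal{A})$ with the relations implied
by (1.5.23). It is clear that the matrix $\mathcal{J}_0$ must annihilate all the negative powers in
(1.5.15) and, since this must hold identically in $\lambda$, the coefficient matrices must be
annihilated, i.e.

$$\mathcal{C}_1 \mathcal{J}_0 = \mathcal{C}_2 \mathcal{J}_0 = \cdots = \mathcal{C}_s \mathcal{J}_0 = \mathcal{O}, \tag{1.5.24}$$

and here we can also permute the factors.
We now return to (1.5.20), which may be written

$$\mathcal{C}_1 = \mathcal{E} - \mathcal{J}_0.$$

Multiplying both sides by $\mathcal{C}_1$ and using (1.5.24) we get

$$\mathcal{C}_1^2 = \mathcal{C}_1 \equiv \mathcal{J},$$

i.e. an idempotent as asserted. Recall that (1.5.15) is the expansion of $\mathcal{R}(\lambda, \mathcal{A})$ in
the neighborhood of $\lambda = \alpha$, one of the zeros of $\Delta(\lambda)$. If $\alpha$ is actually in the spectrum,
then $\mathcal{J}$ cannot be the zero matrix. For if $\mathcal{J} = \mathcal{C}_1 = \mathcal{O}$, then by (1.5.19) $\mathcal{C}_2 = \mathcal{C}_3 =
\cdots = \mathcal{C}_s = \mathcal{O}$ and $\mathcal{R}(\alpha, \mathcal{A}) = \mathcal{B}(\alpha)$ would exist, contradicting the fact that $\alpha$ is in the
spectrum. $\mathcal{J}$ may be the unit matrix but only in an exceptional case, for which see
below.

We now set

$$\mathcal{C}_2 = \mathcal{Q} \tag{1.5.25}$$

and find that

$$\mathcal{J}\mathcal{Q} = \mathcal{Q}\mathcal{J} = \mathcal{Q}, \tag{1.5.26}$$

of interest only if $\mathcal{Q}$ is not the zero matrix. Equations (1.5.19) now give

$$\mathcal{C}_j = \mathcal{Q}^{j-1}, \qquad j = 2, 3, \ldots, s \tag{1.5.27}$$

and

$$\mathcal{Q}^s = \mathcal{O}, \tag{1.5.28}$$

so that $\mathcal{Q}$ is nilpotent as asserted above.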
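From (1.5.23) we obtain

$$\mathcal{B}(\lambda) = \mathcal{J}_0\, \mathcal{B}(\lambda) = \mathcal{B}(\lambda)\, \mathcal{J}_0. \tag{1.5.29}$$

Since $\mathcal{J}_0 \mathcal{J} = \mathcal{J} \mathcal{J}_0 = \mathcal{O}$, we get

$$\mathcal{J}\mathcal{B}(\lambda) = \mathcal{B}(\lambda)\mathcal{J} = \mathcal{O}, \qquad \mathcal{Q}\mathcal{B}(\lambda) = \mathcal{B}(\lambda)\mathcal{Q} = \mathcal{O}. \tag{1.5.30}$$

We have now obtained the principal part of $\mathcal{R}(\lambda, \mathcal{A})$ at the spectral singularity
$\lambda = \alpha$ and the same type of expansion holds at the other points of the spectrum.
Returning to (1.5.17) and using (1.5.30), we see that $\mathcal{J}$ and $\mathcal{Q}$ annihilate all the
matrices $\mathcal{C}_{lm}$ with $m$ such that $\lambda_m \ne \alpha$. Now $\mathcal{C}_{1m}$ is the idempotent associated with
the spectral value $\lambda_m$. It follows that all these idempotents are annihilated by $\mathcal{J}$ and
hence also by $\mathcal{Q}$. In other words, the idempotents associated with distinct spectral
values of $\mathcal{A}$ are mutually orthogonal. We denote now the idempotent corresponding
to $\lambda = \lambda_m$ by $\mathcal{J}_m$ and the nilpotent by $\mathcal{Q}_m$. We have then proved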


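Theorem 1.5.1. The resolvent of the matrix $\mathcal{A}$ has a partial fraction expansion
of the form

$$\mathcal{R}(\lambda, \mathcal{A}) = \sum_{m=1}^{p} \left[ \frac{\mathcal{J}_m}{\lambda - \lambda_m} + \frac{\mathcal{Q}_m}{(\lambda - \lambda_m)^2} + \cdots + \frac{\mathcal{Q}_m^{s_m - 1}}{(\lambda - \lambda_m)^{s_m}} \right]. \tag{1.5.31}$$

Here the summation extends over the $p$ distinct characteristic values of $\mathcal{A}$ and

$$\mathcal{J}_j \mathcal{J}_k = \delta_{jk} \mathcal{J}_j, \tag{1.5.32}$$

$$\mathcal{J}_j \mathcal{Q}_k = \mathcal{Q}_k \mathcal{J}_j = \mathcal{O}, \qquad \mathcal{Q}_j \mathcal{Q}_k = \mathcal{Q}_k \mathcal{Q}_j = \mathcal{O}, \qquad j \ne k. \tag{1.5.33}$$

Further, $s_m$ is the least integer $k$ such that

$$\mathcal{Q}_m^k = \mathcal{O} \tag{1.5.34}$$

and $s_m$ is at most equal to the multiplicity $k_m$ of $\lambda_m$ as a root of $\Delta(\lambda) = 0$. Finally,

$$\mathcal{Q}_m = \mathcal{J}_m(\mathcal{A} - \lambda_m \mathcal{E}). \tag{1.5.35}$$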


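There are still various issues to be faced. First, can $\mathcal{J}_m$ be the unit matrix?
If so, then since the idempotents are mutually orthogonal and $\mathcal{E}$ is orthogonal only to $\mathcal{O}$, there
is a single spectral value, say $\lambda = \alpha$, and

$$\mathcal{R}(\lambda, \mathcal{A}) = (\lambda - \alpha)^{-1}\mathcal{E} + (\lambda - \alpha)^{-2}\mathcal{Q} + \cdots + (\lambda - \alpha)^{-s}\mathcal{Q}^{s-1}, \tag{1.5.36}$$

so that

$$\mathcal{A} = \alpha\mathcal{E} + \mathcal{Q}$$

and $\mathcal{A}$ differs from $\alpha\mathcal{E}$ only by a nilpotent. The conclusion also follows from
(1.5.35).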

A second question is: Can $s_m$ be less than the multiplicity of $\lambda_m$ as a root of
$\Delta(\lambda) = 0$? An affirmative answer to this question can be read off from (1.5.12).
This formula is valid whether or not the $a_j$'s are distinct. Suppose that $a_j = 0$ with
multiplicity $k$ and $a_j = 1$ with multiplicity $n - k$. There are only two characteristic
values, namely 0 and 1 of stated multiplicities. Here $\mathcal{A}$ itself is idempotent and the
expansion of its resolvent is simply

$$\mathcal{R}(\lambda, \mathcal{A}) = \lambda^{-1}(\mathcal{E} - \mathcal{A}) + (\lambda - 1)^{-1}\mathcal{A}. \tag{1.5.37}$$

Thus the resolvent has only two simple poles. If $k > 1$, $n - k > 1$ we have
$s_j < k_j$, $j = 1, 2$.
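
Formula (1.5.37) can be checked directly on a small example. The sketch below evaluates its right-hand side for a non-diagonal idempotent and verifies that the result inverts $\lambda\mathcal{E} - \mathcal{A}$; the sample matrix, the value of $\lambda$ and the helper names are assumptions chosen only for illustration.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def idempotent_resolvent(a, lam):
    """Right-hand side of (1.5.37): lam^{-1}(E - A) + (lam - 1)^{-1} A."""
    n = len(a)
    lam = Fraction(lam)
    return [[(int(i == j) - a[i][j]) / lam + a[i][j] / (lam - 1)
             for j in range(n)] for i in range(n)]


A = [[1, 1], [0, 0]]          # idempotent but not diagonal
lam = 3                        # any value other than 0 and 1 will do
R = idempotent_resolvent(A, lam)
lamE_minus_A = [[lam * int(i == j) - A[i][j] for j in range(2)] for i in range(2)]
product = matmul(lamE_minus_A, R)  # should be the unit matrix
```

The only poles of `R` as a function of `lam` are at 0 and 1, and both are simple, whatever the multiplicities of these two characteristic values.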

Our third question is: Can it happen that a zero of $\Delta(\lambda)$ is not a pole of
$\mathcal{R}(\lambda, \mathcal{A})$? This presupposes that the corresponding idempotent is the zero matrix.
Suppose that this takes place at $\lambda = \alpha$. Note that $\alpha\mathcal{E} - \mathcal{A}$ is definitely a singular
matrix since $\Delta(\alpha) = 0$. This means that $\mathcal{R}(\lambda, \mathcal{A})$ does not exist for $\lambda = \alpha$, so this
point must be a singularity of $\mathcal{R}(\lambda, \mathcal{A})$, a function-theoretical singularity. Now
formula (1.5.11) shows that $\mathcal{R}(\lambda, \mathcal{A})$ is a rational function of $\lambda$, the quotient of a
matrix polynomial of degree $\le n - 1$ in $\lambda$ and a scalar polynomial in $\lambda$ of degree $n$.
Such a function can have no singularities save for poles in the complex $\lambda$-plane.
Hence $\lambda = \alpha$ is a pole and the idempotent cannot be the zero matrix. Thus the third
question is answered in the negative.

For the further development we need 


Lemma 1.5.1. For large values of $|\lambda|$

$$\mathcal{R}(\lambda, \mathcal{A}) = \lambda^{-1}\mathcal{E} + \lambda^{-2}\mathcal{A} + \lambda^{-3}\mathcal{A}^2 + \cdots + \lambda^{-n-1}\mathcal{A}^n + \cdots. \tag{1.5.38}$$

In particular, the series converges and the expansion is valid for $|\lambda| > \|\mathcal{A}\|$.

Proof. By Lemma 1.4.3 we have

$$\|\mathcal{A}^n\| \le \|\mathcal{A}\|^n, \tag{1.5.39}$$

valid for the norm (1.4.42) as well as for (1.4.43). This proves the assertion concerning
convergence of the series for $|\lambda| > \|\mathcal{A}\|$. For such values we can multiply the
convergent series left or right by $\lambda\mathcal{E} - \mathcal{A}$ and collect terms. They all cancel except
for the constant term, which is $\mathcal{E}$ as it should be. ∎
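
The cancellation used in the proof can be made explicit for a partial sum. A minimal sketch, assuming a sample 2-by-2 matrix and hypothetical helper names: multiplying the first $N$ terms of (1.5.38) by $\lambda\mathcal{E} - \mathcal{A}$ leaves $\mathcal{E} - \lambda^{-N}\mathcal{A}^N$, and the remainder term shrinks as $N$ grows when $|\lambda|$ is large enough.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]


def neumann_partial_sum(a, lam, terms):
    """Return (S, A**terms) with S = sum over m < terms of lam^{-m-1} A^m."""
    n = len(a)
    total = [[Fraction(0)] * n for _ in range(n)]
    power = identity(n)            # current power A^m, starting with A^0 = E
    coeff = 1 / Fraction(lam)      # current scalar lam^{-m-1}
    for _ in range(terms):
        total = [[total[i][j] + coeff * power[i][j] for j in range(n)]
                 for i in range(n)]
        power = matmul(power, a)
        coeff /= lam
    return total, power


A = [[Fraction(1), Fraction(1)], [Fraction(0), Fraction(2)]]
lam, terms = Fraction(4), 6        # 4 exceeds both characteristic roots 1 and 2
S, A_pow = neumann_partial_sum(A, lam, terms)
lamE_minus_A = [[lam * int(i == j) - A[i][j] for j in range(2)] for i in range(2)]
lhs = matmul(lamE_minus_A, S)
tail = [[A_pow[i][j] / lam ** terms for j in range(2)] for i in range(2)]
rhs = [[identity(2)[i][j] - tail[i][j] for j in range(2)] for i in range(2)]
```

Here `lhs` and `rhs` agree exactly, and every entry of `tail` is at most 1/64 in absolute value after six terms.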


The reader should note that the lemma does not assert that the radius of
convergence of the series is $\|\mathcal{A}\|$, an expression which has at least two different
meanings. Actually the series converges for

$$|\lambda| > \max_j |\lambda_j|, \tag{1.5.40}$$

as we can deduce from (1.5.31), and this is always the exact radius of convergence.
Among the consequences of Lemma 1.5.1 we list 


Lemma 1.5.2. If $\mathcal{A}$ has $p$ distinct characteristic roots and if $\mathcal{J}_1, \mathcal{J}_2, \ldots, \mathcal{J}_p$ are
the corresponding idempotents, then

$$\mathcal{J}_1 + \mathcal{J}_2 + \cdots + \mathcal{J}_p = \mathcal{E}. \tag{1.5.41}$$

Proof. We compare formulas (1.5.31) and (1.5.38), both valid for large values of
$|\lambda|$. We can expand the partial fractions in descending powers of $\lambda$. Only the first
order fractions can give a term in $1/\lambda$ and it will have the coefficient $\mathcal{J}_m$. Adding
terms and comparing the result with (1.5.38) we obtain (1.5.41). ∎


We call this formula a resolution of the identity. It is a decomposition of the unit 
matrix into a sum of mutually orthogonal idempotents belonging to the matrix A. 
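
A resolution of the identity can be computed by hand when all characteristic roots are simple. The sketch below relies on the standard interpolation formula $\mathcal{J}_m = \prod_{j \ne m} (\mathcal{A} - \lambda_j\mathcal{E})/(\lambda_m - \lambda_j)$, which is not stated in the text and holds only under the assumption that the roots are distinct; the function name and the sample matrix are likewise illustrative.

```python
from fractions import Fraction


def matmul(x, y):
    """Product of two square matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_idempotents(a, roots):
    """J_m = prod_{j != m} (A - lam_j E) / (lam_m - lam_j); simple roots only."""
    n = len(a)
    result = []
    for m, lam_m in enumerate(roots):
        proj = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
        for j, lam_j in enumerate(roots):
            if j == m:
                continue
            factor = [[(a[r][c] - lam_j * int(r == c)) / Fraction(lam_m - lam_j)
                       for c in range(n)] for r in range(n)]
            proj = matmul(proj, factor)
        result.append(proj)
    return result


A = [[2, 1], [0, 3]]               # characteristic roots 2 and 3
J1, J2 = spectral_idempotents(A, [2, 3])
total = [[J1[i][j] + J2[i][j] for j in range(2)] for i in range(2)]
```

For this matrix `J1` and `J2` are idempotent, their product is the zero matrix, and `total` is the unit matrix, as (1.5.41) requires.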

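We shall use this result to prove the Cayley-Hamilton Theorem [after Arthur
Cayley (1821-95) and Sir William Rowan Hamilton (1805-65)]. With the notation
of Theorem 1.5.1 set

$$H(\lambda) = \prod_{j=1}^{p} (\lambda - \lambda_j)^{s_j}. \tag{1.5.42}$$

This polynomial in $\lambda$ is a divisor of $\Delta(\lambda)$ and will coincide with the latter iff each
$s_j = k_j$, the multiplicity of $\lambda_j$ as a characteristic root. We now form

$$H(\mathcal{A}) = \prod_{j=1}^{p} (\mathcal{A} - \lambda_j\mathcal{E})^{s_j} \tag{1.5.43}$$

and note that the factors on the right commute. The Cayley-Hamilton theorem now
reads:

Theorem 1.5.2. $H(\mathcal{A})$ is the zero matrix.

Proof. Lemma 1.5.2 gives

$$H(\mathcal{A}) = H(\mathcal{A})[\mathcal{J}_1 + \mathcal{J}_2 + \cdots + \mathcal{J}_p]. \tag{1.5.44}$$

Here

$$H(\mathcal{A})\mathcal{J}_m = \prod_{j=1}^{p} (\mathcal{A} - \lambda_j\mathcal{E})^{s_j} \mathcal{J}_m = H_m(\mathcal{A})\, (\mathcal{A} - \lambda_m\mathcal{E})^{s_m} \mathcal{J}_m,$$

where

$$H_m(\mathcal{A}) = \prod_{j \ne m} (\mathcal{A} - \lambda_j\mathcal{E})^{s_j}.$$

We now refer to formula (1.5.35), according to which

$$(\mathcal{A} - \lambda_m\mathcal{E})^{s_m} \mathcal{J}_m = \mathcal{Q}_m^{s_m} = \mathcal{O}.$$

Hence each of the $p$ products in (1.5.44) is zero and so is their sum. ∎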


Thus $\mathcal{A}$ satisfies

$$H(\lambda) = 0, \tag{1.5.45}$$

known as the minimal equation. "Satisfies" means here that if $\lambda$ is replaced by $\mathcal{A}$ and
1 by $\mathcal{E}$ in the polynomial $H(\lambda)$, the result is the zero matrix. It is clear that $\mathcal{A}$ also
satisfies the characteristic equation. The term minimal refers to the fact that if $\mathcal{A}$
satisfies an algebraic equation $G(\lambda) = 0$ in the sense just mentioned, then $G(\lambda)$
must be divisible by $H(\lambda)$. See Problems 2 (Exercise 1.5) and 3 and 4
(Exercise 1.6).
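
The difference between the minimal and the characteristic equation shows up already for 3-by-3 matrices. In the sketch below both sample matrices (chosen here for illustration, as are the helper names) have $\Delta(\lambda) = \lambda(\lambda - 1)^2$; the diagonal one already satisfies $\lambda(\lambda - 1) = 0$, while the one containing a Jordan block does not and needs the full characteristic equation.

```python
def matmul(x, y):
    """Product of two square integer matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def shifted(a, c):
    """Return A - c*E."""
    n = len(a)
    return [[a[i][j] - c * int(i == j) for j in range(n)] for i in range(n)]


def a_times_a_minus_e_power(a, exponent):
    """Evaluate the polynomial lambda * (lambda - 1)**exponent at the matrix A."""
    result = a
    for _ in range(exponent):
        result = matmul(result, shifted(a, 1))
    return result


DIAGONAL = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]   # s_j = 1 for both roots
JORDAN = [[1, 1, 0], [0, 1, 0], [0, 0, 0]]     # s = 2 for the root 1
ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

diag_minimal = a_times_a_minus_e_power(DIAGONAL, 1)   # zero matrix
jordan_low = a_times_a_minus_e_power(JORDAN, 1)       # not zero
jordan_char = a_times_a_minus_e_power(JORDAN, 2)      # zero matrix
```

Both matrices satisfy the characteristic equation, but only for `DIAGONAL` does the lower-degree polynomial vanish.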
Let us return for a moment to formula (1.5.31), which expresses $\mathcal{R}(\lambda, \mathcal{A})$ as the
sum of terms of the form

$$\mathcal{B}(\lambda - \alpha)^{-k}, \tag{1.5.46}$$

where $\mathcal{B}$ is a matrix and the other factor is a rational function of $\lambda$. Each of these
factors has derivatives of all orders with respect to $\lambda$. From this we may conclude
that $\mathcal{R}(\lambda, \mathcal{A})$ also has derivatives of all orders with respect to $\lambda$. In particular we
can find $\mathcal{R}'(\lambda, \mathcal{A})$ by termwise differentiation in formula (1.5.31).


EXERCISE 1.5 


1. If $\mathcal{E}_{jj}$ is the matrix with a 1 in the place $(j, j)$ and zeros elsewhere and if $n > 2$ find
the characteristic and the minimal equations of $\mathcal{E}_{jj}$.

2. Let $G(\lambda)$ be a scalar polynomial in $\lambda$ with no root in common with the minimal
equation $H(\lambda) = 0$ satisfied by a given matrix $\mathcal{A}$. Show that $G(\mathcal{A})$ is a regular matrix
and hence not zero.


3. Let $G(\lambda)$ be a scalar polynomial in $\lambda$ of degree $m < n$. How is the spectrum of $G(\mathcal{A})$
related to that of $\mathcal{A}$?


4. Assuming the spectral values of # to be distinct, find RU, A’). 


5. If a matrix A satisfies the equation A? — 4A = 0, find the spectrum and the 
resolvent. 


6. Show that formula (1.5.37) is valid for any idempotent matrix $\mathcal{A}$ and not merely for
diagonal idempotent matrices.

7. What is the minimal equation of a nilpotent element? The characteristic equation?


8. Assuming & to be a non-singular diagonal matrix, in how many ways can a diagonal 
matrix B be found such that B* = A? 


9. Show that the characteristic roots of a Hermitian matrix are real. 


10. Show that the spectrum of a unitary matrix (see Problem 4, Exercise 1.3) lies on the 
unit circle in the complex plane and is symmetric with respect to the real axis. 


11. What is the inverse of a unitary matrix? Show that the minimal equation has real
coefficients and is a so-called reciprocal equation, i.e. if $\alpha$ is a root, so is $1/\alpha$.


12. What is the relation between the spectrum of A and that of its conjugate transpose 
A*? 


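13. If $\lambda$ and $\mu$ are not in the spectrum of $\mathcal{A}$, show that

$$(\lambda - \mu)\, \mathcal{R}(\lambda, \mathcal{A})\, \mathcal{R}(\mu, \mathcal{A}) = \mathcal{R}(\mu, \mathcal{A}) - \mathcal{R}(\lambda, \mathcal{A}).$$

This functional equation is known as the first resolvent equation.


14. At the end of Section 1.5 it was shown that $\mathcal{R}(\lambda, \mathcal{A})$ has derivatives of all orders with
respect to $\lambda$. Use the result of the preceding problem to show that $\mathcal{R}'(\lambda, \mathcal{A}) =
-[\mathcal{R}(\lambda, \mathcal{A})]^2$ and obtain from this $\mathcal{R}^{(m)}(\lambda, \mathcal{A}) = (-1)^m m!\, [\mathcal{R}(\lambda, \mathcal{A})]^{m+1}$ for all $m$.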


1.6 INVARIANTIVE AND METRIC PROPERTIES 


Let us write $C^n = X$ to abbreviate and consider a linear transformation $A$ on $X$ to
itself which in terms of a given ordered basis $F = (u_1, \ldots, u_n)$ is represented by the
matrix $\mathcal{A}$ of $\mathfrak{M}_n$. Let the distinct characteristic values of $\mathcal{A}$ be $\lambda_1, \lambda_2, \ldots, \lambda_p$, where the
multiplicity of $\lambda_j$ is $k_j$. As usual, $\mathcal{J}_j$ shall be the idempotent, $\mathcal{Q}_j$ the nilpotent
corresponding to $\lambda_j$. Further, $s_j$ is the least integer such that

$$\mathcal{Q}_j^{s_j} = \mathcal{O}. \tag{1.6.1}$$

We know that $s_j \le k_j$.
In this section we shall consider two disconnected topics. The first is the
invariant subspaces connected with the transformation $A$. The second is concerned
with norm and distance for the spaces $X$ and $\mathfrak{E}(X) = \mathfrak{E}(C^n)$.
The matrix $\mathcal{J}_j$ defines a projection of $X$ onto a linear subspace $X_j \subset X$. The
subspaces $X_i$ and $X_j$, $i \ne j$, have only the zero element in common since the
corresponding matrices $\mathcal{J}_i$ and $\mathcal{J}_j$ are orthogonal to each other. Hence any element
$x \in X$, $x \ne 0$, has a unique decomposition obtained by applying (1.5.41),

$$x = x_1 + x_2 + \cdots + x_p, \qquad x_j \in X_j. \tag{1.6.2}$$

Here $X_j$ is known as the $j$th root space of $X$ and it is left pointwise invariant under $\mathcal{J}_j$:

$$\mathcal{J}_j(x) = x \quad \text{if } x \in X_j. \tag{1.6.3}$$

Note also that $\mathcal{J}_j$ maps $X_k$, $k \ne j$, into the zero vector.
Some vectors of $X$ are annihilated by the matrix $\mathcal{A} - \lambda_j\mathcal{E}$. We start with vectors
in $X_j$. As we shall see, no vector $x$ in $X_k$, $x \ne 0$, can have this property for a $k \ne j$.
Suppose that $x$ is a characteristic vector, i.e.

$$(\mathcal{A} - \lambda_j\mathcal{E})x = 0, \qquad x \ne 0, \quad x \in X_j. \tag{1.6.4}$$

The set of such vectors along with the zero vector is a linear subspace $\Xi_j$ of $X_j$,
known as the characteristic or eigen space corresponding to $\lambda_j$. Here $\Xi_j$ may
coincide with $X_j$. If it is a proper subspace, however, then there are vectors in
$X_j \ominus \Xi_j$ which are annihilated by some power of $\mathcal{A} - \lambda_j\mathcal{E}$ higher than the first.
In fact, all of $X_j$ is annihilated by the $s_j$th power, i.e.

$$(\mathcal{A} - \lambda_j\mathcal{E})^{s_j} x = 0, \qquad x \in X_j. \tag{1.6.5}$$

This follows from the definitions of $\mathcal{Q}_j$ and of $s_j$. We have

$$\mathcal{Q}_j = (\mathcal{A} - \lambda_j\mathcal{E})\mathcal{J}_j,$$

so that on $X_j$ the matrices $\mathcal{A} - \lambda_j\mathcal{E}$ and $\mathcal{Q}_j$ exert the same action. Hence $x \in X_j$
implies that

$$(\mathcal{A} - \lambda_j\mathcal{E})^{s_j} x = \mathcal{Q}_j^{s_j} x = 0. \tag{1.6.6}$$

In particular, this argument shows that $\Xi_j$ cannot reduce to the zero vector. It was
remarked above that no vector in $X_k$, different from the zero vector, can be
annihilated by $\mathcal{A} - \lambda_j\mathcal{E}$ for a $j \ne k$. This will now be proved.

Suppose that $x \in X_k$, $x \ne 0$, $j \ne k$, and $\mathcal{A}x = \lambda_j x$. Then

$$(\mathcal{A} - \lambda_k\mathcal{E})x = (\lambda_j - \lambda_k)x$$

and, generally,

$$(\mathcal{A} - \lambda_k\mathcal{E})^m x = (\lambda_j - \lambda_k)(\mathcal{A} - \lambda_k\mathcal{E})^{m-1} x \tag{1.6.7}$$

for any positive integer $m$. Since $x \in X_k$, there exists an $m$ such that the left member
of (1.6.7) is zero. Since $j \ne k$, this implies that $(\mathcal{A} - \lambda_k\mathcal{E})^{m-1} x$ is also zero and by
complete induction that $x = 0$. The same argument shows that no power of
$\mathcal{A} - \lambda_j\mathcal{E}$ can annihilate an $x \in X_k$, $x \ne 0$, $j \ne k$.

We shall now discuss the dimensions of $\Xi_j$ and $X_j$. These are linear vector
spaces and the notion of dimension was defined in Section 1.2 as the largest number
of linearly independent vectors belonging to the space in question. We shall need
the following:


Lemma 1.6.1. Similar matrices have identical characteristic equations and 
thus have the same characteristic values with the same multiplicities. 


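Proof. Suppose that $\mathcal{A}_1 = \mathcal{B}\mathcal{A}\mathcal{B}^{-1}$, where $\mathcal{B}$ is a regular matrix. We recall that

$$\det(\mathcal{B}) \det(\mathcal{B}^{-1}) = \det(\mathcal{E}) = 1.$$

It follows that

$$\begin{aligned} \det(\lambda\mathcal{E} - \mathcal{A}_1) &= \det(\lambda\mathcal{E} - \mathcal{B}\mathcal{A}\mathcal{B}^{-1}) \\ &= \det[\mathcal{B}(\lambda\mathcal{E} - \mathcal{A})\mathcal{B}^{-1}] \\ &= \det(\mathcal{B}) \det(\lambda\mathcal{E} - \mathcal{A}) \det(\mathcal{B}^{-1}) \\ &= \det(\lambda\mathcal{E} - \mathcal{A}) \end{aligned}$$

as asserted. From the identity of the characteristic equations the other assertions
follow. ∎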
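
Lemma 1.6.1 is easy to confirm on a small case. The sketch below, with sample matrices and helper names that are assumptions made for illustration, forms $\mathcal{A}_1 = \mathcal{B}\mathcal{A}\mathcal{B}^{-1}$ for a 2-by-2 integer matrix $\mathcal{B}$ of determinant 1 and compares the coefficients of the two characteristic polynomials, which for a 2-by-2 matrix are $\lambda^2 - (\text{trace})\lambda + \det$.

```python
def matmul(x, y):
    """Product of two square integer matrices given as nested lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def char_poly_2x2(a):
    """Coefficients (1, -trace, det) of det(lambda*E - A) for a 2x2 matrix."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return (1, -trace, det)


A = [[2, 1], [0, 3]]
B = [[1, 2], [1, 3]]          # det(B) = 1, so B is regular
B_inv = [[3, -2], [-1, 1]]    # inverse of B, checked in the test below
A1 = matmul(matmul(B, A), B_inv)
```

Although `A1` looks quite different from `A`, both have the characteristic polynomial $\lambda^2 - 5\lambda + 6$.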


We denote the dimension of a linear vector space $X_0$ by $\dim(X_0)$ and prove

Theorem 1.6.1. $1 \le \dim(\Xi_j) \le k_j$.

Proof. It was noted above that there is at least one $x$ with $x \ne 0$ belonging to $\Xi_j$.
Since all constant multiples of this vector also belong to $\Xi_j$, it follows that
$\dim(\Xi_j) \ge 1$. To prove that $\dim(\Xi_j) \le k_j$, the multiplicity of $\lambda_j$, we argue as
follows. Suppose that $\dim(\Xi_j) = m$. We can now find a new ordered basis for $X$,
say $F_1 = (v_1, \ldots, v_m, v_{m+1}, \ldots, v_n)$, such that the first $m$ vectors belong to $\Xi_j$. Now
the action of the transformation $A$ on $F_1$ is represented by a matrix $\mathcal{A}_1$ similar to $\mathcal{A}$.
Here the first $m$ columns of $\mathcal{A}_1$ are made up of zeros except for the elements of the
main diagonal, which are all equal to $\lambda_j$. It follows that $\det(\lambda\mathcal{E} - \mathcal{A}_1)$ has $(\lambda - \lambda_j)^m$
as a factor so that $\lambda = \lambda_j$ is a characteristic root of $\mathcal{A}_1$ of multiplicity $\ge m$. But by
Lemma 1.6.1, $\mathcal{A}$ and $\mathcal{A}_1$, being similar, have the same characteristic values with the
same multiplicities. Thus $m \le k_j$ as asserted. ∎


We come now to the question of the dimension of $X_j$. We shall prove that
$\dim(X_j) = k_j$. For this purpose we need

Lemma 1.6.2. If the matrix $\mathcal{B}$ has $\lambda = 0$ as a characteristic value of multiplicity
$k$, then the same is true for $\mathcal{B}^r$, where $r$ is any positive integer.

Proof. Let $\omega$ be a primitive $r$th root of unity and consider

$$\lambda^r\mathcal{E} - \mathcal{B}^r = (-1)^{r-1} \prod_{j=0}^{r-1} (\lambda\omega^j\mathcal{E} - \mathcal{B}). \tag{1.6.8}$$

Hence, if one of the factors on the right is singular, so is the product. Now

$$\det(\lambda^r\mathcal{E} - \mathcal{B}^r) = (-1)^{(r-1)n} \prod_{j=0}^{r-1} \det(\lambda\omega^j\mathcal{E} - \mathcal{B}). \tag{1.6.9}$$

Here the first factor on the right with $j = 0$ is divisible by $\lambda^k$ and by no higher
power of $\lambda$. The various factors on the right are simply permuted under the
mapping $\lambda \to \omega\lambda$. It follows that each factor is divisible by $\lambda^k$ and by no higher
power of $\lambda$. This shows that the left member is divisible by $\lambda^{rk}$ and by no higher
power, so that, with $\alpha = \lambda^r$,

$$\det(\alpha\mathcal{E} - \mathcal{B}^r)$$

is divisible by $\alpha^k$ and by no higher power. Hence $\alpha = 0$ is a characteristic root of $\mathcal{B}^r$
of multiplicity $k$. ∎
We apply this lemma to the case

$$\mathcal{B} = \mathcal{A} - \lambda_j\mathcal{E}, \qquad r = s_j$$

to get

Theorem 1.6.2. The dimension of the root space $X_j$ equals the multiplicity of
the corresponding characteristic root $\lambda_j$.

Proof. We consider the matrix $(\mathcal{A} - \lambda_j\mathcal{E})^{s_j}$. It has the characteristic value 0 with
the multiplicity $k_j$ by Lemma 1.6.2 since $\mathcal{A} - \lambda_j\mathcal{E}$ has this property. We now appeal
to (1.6.6) which says that all of $X_j$ is annihilated by $(\mathcal{A} - \lambda_j\mathcal{E})^{s_j}$. Moreover, it was
shown that a vector $x$, $x \ne 0$, is annihilated by this matrix iff $x \in X_j$. This shows
that $(\mathcal{A} - \lambda_j\mathcal{E})^{s_j}$ admits $X_j$ as the characteristic space for the characteristic value
zero. By Theorem 1.6.1 the dimension of a characteristic space is at most equal to
the multiplicity of the corresponding characteristic value. In the present case this
gives, for $j = 1, 2, \ldots, p$,

$$\dim(X_j) \le k_j. \tag{1.6.10}$$

Since $X_i \cap X_j = \{0\}$ if $i \ne j$,

$$\sum_{j=1}^{p} \dim(X_j) = \dim(X) = n. \tag{1.6.11}$$

On the other hand, the total number of characteristic values with attention paid to
multiplicity is also $n$ so that

$$\sum_{j=1}^{p} k_j = n. \tag{1.6.12}$$

Combining the last three relations we see that equality must hold for each $j$ in
(1.6.10). ∎


The rest of this section will be occupied by questions concerning the metrics
of $C^n$ and $\mathfrak{E}(C^n)$. Originally we had an orthonormal basis $\{u_j\}$, and if

$$x = \sum_{j=1}^{n} x_j u_j, \tag{1.6.13}$$

we defined the Euclidean norm by

$$\|x\| = \left\{ \sum_{j=1}^{n} |x_j|^2 \right\}^{1/2}. \tag{1.6.14}$$

This is a real number associated with the vector $x$ having the following
properties:

(N$_1$) $\|x\| \ge 0$ for all $x$; $\|x\| = 0$ iff $x = 0$.

(N$_2$) $\|\alpha x\| = |\alpha|\, \|x\|$, $\alpha \in C$.

(N$_3$) $\|x + y\| \le \|x\| + \|y\|$.

For the last property see Problem 1, Exercise 1.2.
Once we had a norm in $C^n$ we could define a distance simply by setting

$$d(x, y) = \|x - y\|. \tag{1.6.15}$$

This notion of distance satisfies the conditions:

(D$_1$) $d(x, y) \ge 0$ for all $x$, $y$; $d(x, y) = 0$ iff $x = y$.
(D$_2$) $d(x, y) = d(y, x)$.
(D$_3$) $d(x, y) \le d(x, z) + d(z, y)$.


The last condition is known as the triangle inequality for obvious reasons. See 
Problem 2, Exercise 1.2. 

In investigations of what he called the "geometry of numbers" the German
mathematician Hermann Minkowski (1864-1909) found it desirable to introduce
alternate metrics in $R^n$. In particular, he introduced what was later called the
$l_p$-metric. Here one sets

$$\|x\|_p = \left\{ \sum_{j=1}^{n} |x_j|^p \right\}^{1/p}, \tag{1.6.16}$$

where $p$ is any real number $\ge 1$. He also considered the $l_\infty$-norm obtained by
letting $p \to \infty$ in (1.6.16) so that

$$\|x\|_\infty = \max_{1 \le j \le n} |x_j|. \tag{1.6.17}$$

The Euclidean norm is the case $p = 2$.
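
The definitions (1.6.16) and (1.6.17) translate directly into a few lines of code. The following sketch uses a hypothetical function name `lp_norm` and floating-point arithmetic, so the values for large $p$ only approximate the maximum norm.

```python
import math


def lp_norm(x, p):
    """Return the l_p norm of the vector x for p >= 1, or the max norm for p = inf."""
    if p == math.inf:
        return max(abs(c) for c in x)          # formula (1.6.17)
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(abs(c) ** p for c in x) ** (1.0 / p)   # formula (1.6.16)


x = [3, -4]
norm_1 = lp_norm(x, 1)            # sum of absolute values
norm_2 = lp_norm(x, 2)            # Euclidean norm
norm_50 = lp_norm(x, 50)          # already close to the maximum norm
norm_inf = lp_norm(x, math.inf)   # largest absolute value
```

For this vector the norm decreases from 7 at $p = 1$ through 5 at $p = 2$ towards 4 as $p$ grows, which illustrates the limit passage leading to (1.6.17).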

In addition, Minkowski did two other noteworthy things. He formulated
conditions (D$_1$), (D$_2$), (D$_3$) as properties that a notion of distance ought to
possess. As a matter of fact, he did not always insist on (D$_2$), so that the distance
from $x$ to $y$ could be different from the distance from $y$ to $x$.

He also introduced a class of functions from vectors to reals which could serve
for defining a distance. In recent years such functionals have become known as
semi-norms. The two basic properties of a semi-norm are subadditivity and
homogeneity:

$$p(x + y) \le p(x) + p(y), \tag{1.6.18}$$

$$p(\alpha x) = |\alpha|\, p(x). \tag{1.6.19}$$

Minkowski, who was concerned essentially only with $R^n$, required the second
property only for real positive values of $\alpha$. The distance from $x$ to $y$ is then taken as

$$d(x, y) = p(y - x) \tag{1.6.20}$$

and the distance from $y$ to $x$ as $p(x - y)$. If $x \in R^n$ and

$$p(-x) = p(x), \tag{1.6.21}$$

then these two distances are equal, otherwise not.
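
A simple functional that is subadditive and homogeneous for positive factors, but not symmetric, makes the last remark concrete. The sketch below is an invented example, not one taken from Minkowski: each positive component of a displacement costs twice as much as a negative one of the same size, rather like a journey that is slower uphill than downhill.

```python
def gauge(x, up_cost=2, down_cost=1):
    """Asymmetric functional: positive components cost more than negative ones."""
    return sum(up_cost * c if c > 0 else -down_cost * c for c in x)


def distance(x, y):
    """Distance from x to y in the sense of (1.6.20), d(x, y) = p(y - x)."""
    return gauge([b - a for a, b in zip(x, y)])


there = distance([0], [1])    # moving in the positive direction
back = distance([1], [0])     # moving in the negative direction

# A sample check of the triangle inequality (D3) and of positive homogeneity.
x, y, z = [0, 0], [1, -2], [3, 1]
triangle_ok = distance(x, y) <= distance(x, z) + distance(z, y)
homogeneous_ok = gauge([3 * c for c in y]) == 3 * gauge(y)
```

Here the two one-way distances are 2 and 1, so the symmetry condition fails although the triangle inequality still holds.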

Minkowski verified that the functions defined by (1.6.16) and (1.6.17) satisfy
the stated conditions. The homogeneity is obvious but the subadditivity requires
an elaborate argument which becomes simple only for $p = 1$, 2, and $\infty$. This
discussion is postponed until Section 3.1.

In Exercise 1.3 the unit ball in $C^n$ was considered as well as its image under a
linear transformation $A$. In these problems the underlying metric of $C^n$ was the
Euclidean one, but the same problems are meaningful and significant for any $p \ge 1$.
The unit ball changes its shape with $p$, as the reader will find by considering the
situation in $R^2$ and $R^3$.

In Problem 3, Exercise 1.3, the reader had to show an inequality which may be
written in the form

$$\|Ax\|_2 \le \|\mathcal{A}\|_2\, \|x\|_2, \tag{1.6.22}$$

where the Euclidean norm for the corresponding matrix $\mathcal{A}$ is defined by (1.4.42).
This is usually a fairly crude estimate, but at least it shows that the norm of the
linear transformation $A$ when defined by

$$\|A\|_2 = \sup \{\|Ax\|_2;\ \|x\|_2 \le 1\} \tag{1.6.23}$$

is a finite number depending only upon $A$. This is the norm of $A$ as an element of
the algebra $\mathfrak{E}(C^n)$ in the $l_2$-metric. Similarly, we can define an $l_p$-norm

$$\|A\|_p = \sup \{\|Ax\|_p;\ \|x\|_p \le 1\}, \tag{1.6.24}$$

which depends only upon $A$ and $p$. In Section 1.4 it was mentioned that the norm
of the linear transformation $A$ could also be used as norm of the corresponding
representative matrix $\mathcal{A}$. All we have to do is to replace $A$ by $\mathcal{A}$ in the last two
formulas.


EXERCISE 1.6 


1. Why is $\Xi_j \ne \{0\}$?

2. Let $x_0 \in X_j$, $x_0 \ne 0$, and form $x_m = (\mathcal{A} - \lambda_j\mathcal{E})^m x_0$, $m = 1, 2, \ldots, k$. If $x_k \ne 0$,
show that the vectors $x_0, x_1, \ldots, x_k$ are linearly independent vectors of $X_j$.

3. It is desired to justify the name minimal equation given to (1.5.45). In Problem 2,
Exercise 1.5, the reader proved that if $G(\lambda)$ and $H(\lambda)$ have no zeros in common, then
$G(\mathcal{A})$ is regular and hence cannot be the zero matrix. Suppose now that $G(\lambda) = 0$
has the root $\lambda = \lambda_j$ but with a lower multiplicity than for $H(\lambda) = 0$. Show that
there is a vector $x_0$, $x_0 \ne 0$, $x_0 \in X_j$, such that $G(\mathcal{A}) x_0 \ne 0$, so that $G(\mathcal{A})$ cannot be
the zero matrix.

4. Use the preceding result to show that $G(\lambda)$ must be divisible by $H(\lambda)$ for $G(\mathcal{A})$ to be
the zero matrix.

5. If $\|x\|_p$ is defined by (1.6.16) and (1.6.17) show that this functional is subadditive in
the extreme cases $p = 1$ and $p = \infty$.

6. The shortest distance between two places A and B is in daily speech often given as so
and so many hours' ride. Leaving aside the lack of precision in this notion of
distance, show that it satisfies (D$_1$) and (D$_3$). Is (D$_2$) necessarily valid?

7. Using the $l_p$-metric construct the "unit circle" in $R^2$. How do the curves $\|x\|_p = 1$
vary when $p$ increases from $p = 1$ to $p = \infty$? What points are common to all these
curves?

8. Consider $R^n$ and $p = 1$. Show that the unit ball is bounded by hyperplanes, $2^n$ in
number, and give their equations. Find the minimum of the Euclidean distance of a
point P on the $l_1$-unit sphere from the origin. Is the unit ball a convex set?

9. The unit ball in any $l_p$-metric is a convex subset of $C^n$. What properties of the metric
do you need for the proof? (Generalization of Problem 3, Exercise 1.3.)

10. Suppose that a functional $p(x)$ on $C^n$ to reals satisfies the conditions (1.6.18) and
(1.6.19). Suppose that $p(x) \ge 0$ and $p(x) = 0$ iff $x = 0$. Suppose also that
$p(-x) = p(x)$. Show that $d(x, y) = p(x - y)$ defines a distance in $C^n$ satisfying
conditions (D$_1$), (D$_2$), (D$_3$).


1.7 HERMITIAN FORMS 


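In the study of the properties of linear transformations $A \in \mathfrak{E}(C^n)$ valuable
information may be obtained from the inner product

$$(Ax, x). \tag{1.7.1}$$

Let $\{u_j\}$ be an ordered orthonormal basis of $C^n$ in terms of which

$$x = \sum_{j=1}^{n} x_j u_j.$$

There is a matrix $\mathcal{A} \in \mathfrak{M}_n$ such that

$$Ax = \mathcal{A}\cdot x = \sum_{j=1}^{n} \left[ \sum_{k=1}^{n} a_{jk} x_k \right] u_j.$$

We have then

$$(Ax, x) = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_k \bar{x}_j, \tag{1.7.2}$$

$$(x, Ax) = \sum_{j=1}^{n} \sum_{k=1}^{n} \bar{a}_{jk} \bar{x}_k x_j. \tag{1.7.3}$$

For the case of $R^n$ and a real symmetric matrix $\mathcal{A}$, $\mathcal{A}' = \mathcal{A}$, we have then

$$(Ax, x) = (x, Ax) = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_j x_k, \tag{1.7.4}$$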


where the last member is known as a quadratic form. Such forms arose at an early
stage in analytic geometry. In $R^3$ the locus

$$\sum_{j=1}^{3} \sum_{k=1}^{3} a_{jk} x_j x_k = C \tag{1.7.5}$$

is known as a central quadric surface, "central" because there is a center, in this
case the origin. An important problem in analytic geometry is the reduction of a
general quadric to principal axes and to the corresponding normal form of the
equation of the surface. This reduction involves a discussion of the characteristic
roots of associated symmetric matrices and of certain minimax problems. These
facts may serve as a motivation for the following discussion, which applies to the
more general case where $\mathcal{A} = \mathcal{A}^*$ is a Hermitian matrix. In Chapter 11 we shall
encounter extensions to Hilbert space.

Besides quadratic and Hermitian forms we shall also encounter the associated
bilinear polar forms

$$(\mathcal{A}x, y) = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_k \bar{y}_j, \tag{1.7.6}$$

$$(x, \mathcal{A}y) = \sum_{j=1}^{n} \sum_{k=1}^{n} \bar{a}_{jk} x_j \bar{y}_k. \tag{1.7.7}$$

The name polar form comes from analytic geometry where $(\mathcal{A}x, y) = C$ is the
polar plane of the quadric $(\mathcal{A}x, x) = C$ with respect to the pole $y$. If $y$ is on the
quadric, the equation of the tangent plane at $x = y$ is obtained. The following
theorems play an important role for various uniqueness theorems.


Theorem 1.7.1. A necessary and sufficient condition that $\mathcal{A}$ be the zero matrix
is that $(\mathcal{A}x, y) = 0$ for all $x$ and $y$ in $C^n$.

Proof. The condition is obviously necessary. It is also sufficient for if $(\mathcal{A}x, y) = 0$
for all $x$ and $y$, we can take $y = \mathcal{A}x$ and obtain $(\mathcal{A}x, \mathcal{A}x) = 0$ or $\mathcal{A}x = 0$ for all $x$.
This forces $\mathcal{A}$ to be the zero matrix. ∎


For the next theorem we need an important identity which holds for all
complex numbers $s$ and $t$ and all vectors $x$ and $y$:

$$s\bar{t}\,(\mathcal{A}x, y) + \bar{s}t\,(\mathcal{A}y, x) = [\mathcal{A}(sx + ty), sx + ty] - |s|^2 (\mathcal{A}x, x) - |t|^2 (\mathcal{A}y, y). \tag{1.7.8}$$

The verification is left to the reader.
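
Before carrying out the verification by hand, one can test the identity (1.7.8) numerically. The sketch below assumes the inner product $(x, y) = \sum_j x_j \bar{y}_j$, linear in the first argument, which is the convention the proofs in this section rely on; the sample matrix, vectors and scalars are arbitrary choices made for illustration.

```python
def inner(x, y):
    """Inner product (x, y) = sum of x_j * conj(y_j), linear in the first slot."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def apply(a, x):
    """Matrix-vector product A x."""
    return [sum(a[i][k] * x[k] for k in range(len(x))) for i in range(len(a))]


A = [[1 + 2j, 3 + 0j], [-1j, 4 + 0j]]
x = [1 + 0j, 1j]
y = [2 - 1j, 3 + 0j]
s, t = 1 + 1j, 2 - 3j

z = [s * a + t * b for a, b in zip(x, y)]          # the vector sx + ty
abs_s_sq = (s * s.conjugate()).real                # |s|^2 without a square root
abs_t_sq = (t * t.conjugate()).real                # |t|^2

lhs = (s * t.conjugate() * inner(apply(A, x), y)
       + s.conjugate() * t * inner(apply(A, y), x))
rhs = (inner(apply(A, z), z)
       - abs_s_sq * inner(apply(A, x), x)
       - abs_t_sq * inner(apply(A, y), y))
```

The two sides agree up to rounding, as they must, since expanding $[\mathcal{A}(sx + ty), sx + ty]$ produces exactly the four terms that appear in (1.7.8).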


Theorem 1.7.2. A necessary and sufficient condition that $\mathcal{A}$ be the zero matrix
in $\mathfrak{M}_n$ is that $(\mathcal{A}x, x) = 0$ for all $x$ in $C^n$.

Proof. The necessity is again obvious. If the condition is satisfied, then the right
hand side of (1.7.8) is zero for any choice of $s$, $t$, $x$, $y$. We leave $x$ and $y$ arbitrary
and take first $s = t = 1$, obtaining

$$(\mathcal{A}x, y) + (\mathcal{A}y, x) = 0.$$

Next we take $s = i$, $t = 1$, the factor $i$ cancels and we obtain

$$(\mathcal{A}x, y) - (\mathcal{A}y, x) = 0.$$

Addition gives

$$(\mathcal{A}x, y) = 0$$

for all $x$ and $y$, and by Theorem 1.7.1 this implies that $\mathcal{A}$ is the zero matrix. ∎

In these theorems we can replace matrices by linear transformations. We may
also replace $(\mathcal{A}x, y)$ by $(x, \mathcal{A}y)$, etc.
Suppose now that A e €(C"). We can then find a transformation Β ε €(C") 
such that 
(Ax, x) = (x, Bx) (1.7.9) 


for all x. There is at most one such transformation B, for if there were two, B, and 
B,, then 
[x, (B, — B,)x] =0 


for all x and by the alternative form of Theorem 1.7.2 this implies that B, — B, is 
the zero operator. To prove that there is at least one B we use the matrix 
formulation. If A = (a,), & = (by) are corresponding matrices, then (1.7.9) 
implies the identity 

n 


n n n 
2 24 Dp X4Xj = 4 2. D jp X4Xj 
4s i= 


72 
in the 2” variables x,, X2, ..., Xs X14, X25 -.+» X,. Hence 
bi, = ἄχ; (1.7.10) 
so that 

b= 4%, (1.7.11) 
As a consequence we write 

B= A* (1.7.12) 


and refer to A* as the adjoint transformation of A. If 
A = A* 


we say that A is self-adjoint or Hermitian. This is the case to be studied in the
following. The corresponding form (1.7.2) is known as a Hermitian form and the
matrix is also Hermitian.
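As an illustration of (1.7.11), the adjoint matrix is the conjugate transpose. The sketch below assumes matrices stored as lists of rows and the inner product (x, y) = Σ x_j ȳ_j; the function names and the sample data are chosen here for illustration only.

```python
def inner(x, y):
    """Inner product (x, y) = sum_j x_j * conj(y_j)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def adjoint(A):
    """Conjugate transpose: entry (j, k) of A* is conj(a_kj), as in (1.7.11)."""
    n = len(A)
    return [[A[k][j].conjugate() for k in range(n)] for j in range(n)]

def is_hermitian(A):
    """True when A coincides with its adjoint."""
    return A == adjoint(A)

# Sample (hypothetical) data.
A = [[1 + 2j, 3j], [4 + 0j, 5 - 1j]]
x = [1 - 1j, 2 + 0.5j]
y = [0.5j, -1 + 1j]

# (Ax, y) and (x, A*y) should agree up to rounding.
gap = abs(inner(apply(A, x), y) - inner(x, apply(adjoint(A), y)))
print(gap, is_hermitian(A))
```

The check uses two different vectors x and y, which is the form of the relation stated later as Lemma 1.7.1 for Hermitian matrices.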


Theorem 1.7.3. The characteristic roots of a Hermitian matrix are real.
Proof. Let $\lambda_0$ be a characteristic root of a Hermitian matrix $\mathcal{H}$ and $x_0$, $\|x_0\| = 1$,
an associated characteristic vector. Then
$(\mathcal{H}x_0, x_0) = (\lambda_0 x_0, x_0) = \lambda_0 (x_0, x_0) = \lambda_0.$
Since $\mathcal{H}$ is Hermitian, the first member equals
$(x_0, \mathcal{H}x_0) = \overline{(\mathcal{H}x_0, x_0)},$
and is hence real, so that $\lambda_0$ is real. ∎


The corresponding Hermitian form is said to be positive definite, indefinite, or
negative definite according as the characteristic roots of $\mathcal{H}$ are all positive, some
positive and some negative, or all negative, respectively.
Theorem 1.7.4. Characteristic vectors corresponding to distinct characteristic
roots of a Hermitian matrix are orthogonal.


Proof. Suppose that $\lambda_1$ and $\lambda_2$ are distinct characteristic roots of the Hermitian
matrix $\mathcal{H}$ and that $x_1$ and $x_2$ are corresponding normalized characteristic vectors.
Since $\lambda_1$ and $\lambda_2$ are real,

$\lambda_1 (x_1, x_2) = (\mathcal{H}x_1, x_2) = (x_1, \mathcal{H}x_2) = \lambda_2 (x_1, x_2),$
so that $(x_1, x_2) = 0$ as asserted. ∎


Here we have used the


Lemma 1.7.1. If $\mathcal{H}$ is Hermitian, then for all x and y
$(\mathcal{H}x, y) = (x, \mathcal{H}y).$ (1.7.13)


The verification is left to the reader.
Let us get some idea of what operations may be performed in the set of
Hermitian matrices in $\mathfrak{M}_n$. We state without proof


Theorem 1.7.5. The sum of two Hermitian matrices is Hermitian. The scalar
product of a real number and a Hermitian matrix is Hermitian. Any positive 
power of a Hermitian matrix is Hermitian but in general the product of two 
Hermitian matrices need not be Hermitian. 


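The last point is illustrated by the example

$\begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix} \begin{pmatrix} 2 & i \\ -i & 2 \end{pmatrix} = \begin{pmatrix} 2-2i & 4+i \\ 4-i & 2+2i \end{pmatrix},$

where the factors are Hermitian but the product is not.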
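The example can be recomputed directly. This is a small sketch with illustrative helper names (`matmul`, `adjoint`, `is_hermitian`); the two factors are those of the example above.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def adjoint(A):
    """Conjugate transpose of A."""
    return [[A[k][j].conjugate() for k in range(len(A))]
            for j in range(len(A[0]))]

def is_hermitian(A):
    return A == adjoint(A)

P = [[1, 2], [2, 1]]
Q = [[2, 1j], [-1j, 2]]
PQ = matmul(P, Q)

print(PQ)                                   # the product of the example
print(is_hermitian(P), is_hermitian(Q), is_hermitian(PQ))
```

The diagonal entries of the product are not real, which already rules out the Hermitian property.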

Another operation which preserves the Hermitian character of a matrix is
transforming it by a unitary matrix. Unitary transformations U and unitary
matrices $\mathcal{U}$ were introduced in Exercise 1.3. We recall that distances and inner
products are preserved under unitary transformations: (z, w) = (Uz, Uw). More-
over, a unitary transformation has an inverse. With $w = \mathcal{U}^{-1}z$ we get


$(z, \mathcal{U}^{-1}z) = (\mathcal{U}z, \mathcal{U}\mathcal{U}^{-1}z) = (\mathcal{U}z, z) = (z, \mathcal{U}^{*}z).$
This implies
$\mathcal{U}^{-1} = \mathcal{U}^{*}$ (1.7.14)


by Theorem 1.7.2. Transforming a Hermitian matrix $\mathcal{H}$ by a unitary matrix $\mathcal{U}$ we
obtain a similar matrix $\mathcal{H}_1$. Here


$(\mathcal{H}_1)^{*} = (\mathcal{U}\mathcal{H}\mathcal{U}^{-1})^{*} = (\mathcal{U}^{-1})^{*}\mathcal{H}^{*}\mathcal{U}^{*} = \mathcal{U}\mathcal{H}\mathcal{U}^{-1} = \mathcal{H}_1.$


Compare Problem 3, Exercise 1.7. Thus $\mathcal{H}_1$ is also Hermitian. It has the same
characteristic values as $\mathcal{H}$. Actually this device may be used to construct a similar
Hermitian matrix which is also diagonal.
Theorem 1.7.6. If H is a Hermitian linear transformation on $C^n$, there exists an
ordered orthonormal basis $\{u_k\}$ such that


$H u_k = \lambda_k u_k, \quad k = 1, 2, \ldots, n,$ (1.7.15)


where $\{\lambda_k\}$ is some arrangement of the characteristic roots of H, each root repeated
as often as its multiplicity indicates.


Proof. We use induction on n. The theorem is obviously true for n = 1. Suppose
that it holds for n = m and consider a Hermitian linear transformation on $C^{m+1}$.
It has m + 1 characteristic values, all real. For simplicity, let $\lambda_1$ be the largest
such value and let $u_1$ be the corresponding normalized characteristic vector so that


$H u_1 = \lambda_1 u_1.$


The vectors in $C^{m+1}$ which are orthogonal to $u_1$ form a linear subspace $X_1$ of
dimension m. We note that $H(X_1) \subset X_1$. For if $x_0 \in X_1$ and $x_0 \neq 0$, then, by
(1.7.13),

$(Hx_0, u_1) = (x_0, Hu_1) = (x_0, \lambda_1 u_1) = \lambda_1 (x_0, u_1) = 0,$


so $Hx_0$ is also orthogonal to $u_1$ and hence confined to $X_1$. We now restrict H to $X_1$.
It is still Hermitian (why?) and its characteristic roots are $\lambda_2, \lambda_3, \ldots, \lambda_{m+1}$, i.e. the
characteristic values of H on $C^{m+1}$ except for $\lambda_1$ (why?). By the induction
hypothesis we can find an orthonormal basis for $X_1$, say $u_2, u_3, \ldots, u_{m+1}$, in terms
of which

$H u_k = \lambda_k u_k, \quad k = 2, 3, \ldots, m + 1.$


Adjoining $u_1$ to this system, we obtain an orthonormal basis for $C^{m+1}$ in terms of
which (1.7.15) holds. Hence the theorem holds for all n. ∎
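A small worked instance of (1.7.15) may help. The sketch below takes the Hermitian 2 by 2 matrix with rows (2, i) and (−i, 2), chosen here purely for illustration, together with hand-computed characteristic vectors for the roots 3 and 1, and checks that they form an orthonormal basis in which H is diagonal.

```python
import math

def inner(x, y):
    """Inner product (x, y) = sum_j x_j * conj(y_j)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

H = [[2, 1j], [-1j, 2]]
r = 1 / math.sqrt(2)
eigenpairs = [
    (3.0, [1j * r, r]),   # lambda_1 = 3 with u_1
    (1.0, [r, 1j * r]),   # lambda_2 = 1 with u_2
]

def residual(lam, u):
    """Largest component of H u - lam u in absolute value."""
    return max(abs(h - lam * c) for h, c in zip(apply(H, u), u))

basis = [u for _, u in eigenpairs]
# Matrix of H in the basis {u_k}: entry (j, k) is (H u_k, u_j).
D = [[inner(apply(H, uk), uj) for uk in basis] for uj in basis]
# Gram matrix of the basis: should be the unit matrix.
G = [[inner(uk, uj) for uk in basis] for uj in basis]
print(D)
```

Up to rounding, `D` is the diagonal matrix with entries 3 and 1, in line with Corollary 2 below.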


Theorem 1.7.7. For a Hermitian matrix each root space coincides with the
corresponding characteristic space.


Proof. Take a characteristic value $\lambda_0$ of multiplicity k and let $X_0$ and $E_0$ be the
corresponding root space and characteristic space. Consider the orthonormal basis
defined in the preceding theorem. To simplify the notation we assume that $\lambda_j = \lambda_0$
for j = 1, 2, ..., k. Consider the corresponding elements of the basis $u_1, u_2, \ldots, u_k$.
They are all annihilated by the first power of $(H - \lambda_0 I)$ so they belong to $E_0$ and
hence also to $X_0$. These elements are linearly independent since they are mutually
orthogonal by construction. Hence $\dim(E_0) \geq k$ and by Theorem 1.6.2
$\dim(X_0) = k$. It follows that $\dim(E_0) = \dim(X_0) = k$ and since $E_0 \subset X_0$ we
must have $E_0 = X_0$ as asserted. ∎


Corollary 1. For a Hermitian matrix the associated nilpotents reduce to the
zero matrix.


Proof. In terms of linear transformations rather than matrices we have


$Q_j = (H - \lambda_j I) P_j.$

Here $P_j$ leaves the jth root space invariant and annihilates the rest of $C^n$. Since the
root space is also the characteristic space, it is annihilated by $H - \lambda_j I$. It follows
that $Q_j$ annihilates all of $C^n$ and is hence the zero operator. ∎


Corollary 2. The Hermitian matrix $\mathcal{H}$ is similar to the diagonal matrix $(\lambda_j \delta_{jk})$.
Proof. Interpret (1.7.15). ∎


Corollary 3. If $\mathcal{H}$ is Hermitian and the similar matrix $\mathcal{H}_1 = \mathcal{A}\mathcal{H}\mathcal{A}^{-1}$ is also
Hermitian, then $\mathcal{A}$ is unitary.


Proof. If
$(\mathcal{A}\mathcal{H}\mathcal{A}^{-1})^{*} = (\mathcal{A}^{-1})^{*}\mathcal{H}\mathcal{A}^{*} = \mathcal{A}\mathcal{H}\mathcal{A}^{-1},$


then $\mathcal{A}^{-1} = \mathcal{A}^{*}$ and $\mathcal{A}$ is unitary. ∎


Thus the Hermitian character of a matrix is preserved under a similarity trans- 
formation iff the transforming matrix is unitary. 

The rest of this section is devoted to the relations between the Hermitian form
(Hx, x) and the spectrum of H. Consider the orthonormal basis $\{u_k\}$ defined in
Theorem 1.7.6. Suppose that


$x = c_1 u_1 + c_2 u_2 + \cdots + c_n u_n, \quad c_j = (x, u_j).$ (1.7.16)
Then
$Hx = \lambda_1 c_1 u_1 + \lambda_2 c_2 u_2 + \cdots + \lambda_n c_n u_n$ (1.7.17)
so that
$(Hx, x) = \lambda_1 |c_1|^2 + \lambda_2 |c_2|^2 + \cdots + \lambda_n |c_n|^2.$ (1.7.18)
We have
$\|x\|^2 = |c_1|^2 + |c_2|^2 + \cdots + |c_n|^2.$ (1.7.19)


Suppose now that the characteristic values are numbered so that


$\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n.$
We have then
$(Hx, x) \leq \lambda_1 \|x\|^2.$ (1.7.20)


If all the λ's are equal to $\lambda_1$, we have equality in (1.7.20). If $\lambda_1 > \lambda_2$, we have
equality on the unit sphere iff $|c_1| = 1$ and $c_2 = c_3 = \cdots = c_n = 0$. Since (Hx, x) is
continuous and bounded on the closed and bounded unit sphere $\|x\| = 1$, (Hx, x) takes
its maximum value at some point. This gives


Theorem 1.7.8. The maximum value of the Hermitian form (Hx, x) on the unit
sphere $\|x\| = 1$ equals $\lambda_1$, the largest of the characteristic values.
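The bound (1.7.20) can be observed by sampling. The sketch below reuses the illustrative matrix with rows (2, i) and (−i, 2), whose characteristic roots are 3 and 1; the sampling scheme and all names are assumptions made for this example, not part of the text.

```python
import math
import random

def inner(x, y):
    """Inner product (x, y) = sum_j x_j * conj(y_j)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

H = [[2, 1j], [-1j, 2]]            # Hermitian; characteristic roots 3 and 1
LAMBDA_MAX, LAMBDA_MIN = 3.0, 1.0

def form(x):
    """The Hermitian form (Hx, x); its imaginary part vanishes up to rounding."""
    return inner(apply(H, x), x).real

def random_unit_vector(rng, n=2):
    """A random point of the unit sphere ||x|| = 1 in C^n."""
    x = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in x))
    return [c / norm for c in x]

rng = random.Random(1)             # fixed seed for reproducibility
values = [form(random_unit_vector(rng)) for _ in range(1000)]
r = 1 / math.sqrt(2)
at_u1 = form([1j * r, r])          # value at a characteristic vector for the root 3
print(min(values), max(values), at_u1)
```

The sampled values stay between the least and the largest root, and the maximum is attained at the characteristic vector belonging to the largest root.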


The other characteristic values are also obtainable as solutions of maxima- 
minima problems, in this case with side conditions. 
Theorem 1.7.9. The maximum value of (Hx, x) is $\lambda_j$, if x is restricted to the set
of vectors which satisfy


$\|x\| = 1, \quad (x, u_1) = (x, u_2) = \cdots = (x, u_{j-1}) = 0.$ (1.7.21)


Proof. We have $c_k = 0$ for $1 \leq k \leq j - 1$. Hence (1.7.18) reduces to

$(Hx, x) = \lambda_j |c_j|^2 + \lambda_{j+1} |c_{j+1}|^2 + \cdots + \lambda_n |c_n|^2 \leq \lambda_j.$
If $\lambda_j > \lambda_{j+1}$, we have $(Hx, x) = \lambda_j$ iff $|(x, u_j)| = 1$ and $(x, u_k) = 0$ for $k \neq j$.
If $\lambda_j = \lambda_{j+1}$, the maximum is still $\lambda_j$, but the maximum is reached on a larger subset
of the sphere. ∎


Thus (Hx,x) takes on all spectral values of H for unit vectors. We call 
attention to the fact that intermediary values are also assumed (why?) so that 
the range of (Hx,x) is the closed interval bounded by the least and the largest 
characteristic roots. 

The proofs of the extremal properties of the characteristic values of a 
Hermitian linear transformation given above presuppose a knowledge of these 
values and corresponding vectors. Actually it is possible to formulate a minimax 
problem which leads directly to the jth characteristic value. 


Theorem 1.7.10. Consider any set of j − 1 linearly independent vectors
$v_1, v_2, \ldots, v_{j-1}$ in $C^n$. Let S be the set of all vectors x such that


$\|x\| = 1, \quad (x, v_1) = (x, v_2) = \cdots = (x, v_{j-1}) = 0,$ (1.7.22)


and let λ(S) denote the maximum of (Hx, x) for x in S. Then for any admissible
set S we have $\lambda(S) \geq \lambda_j$.


Proof. We know the existence of a set $\{v_k\}$ for which $\lambda(S) = \lambda_j$, namely the vectors
$u_1, u_2, \ldots, u_{j-1}$ defined by (1.7.15). We shall prove that
$\min_{S} \max_{x \in S} (Hx, x) = \lambda_j,$ (1.7.23)
which is a stronger assertion than that made in the theorem. To show this, it is
enough to show that any admissible set S contains an element x for which
$(Hx, x) \geq \lambda_j$. Since S is a closed bounded set, $\max_{x \in S} (Hx, x)$ exists and will not be
less than $\lambda_j$, for a value at least this large is reached for some choice of x. From
$\lambda(S) \geq \lambda_j$ and Theorem 1.7.9, which asserts that the infimum of λ(S) is assumed
for at least one choice of S, formula (1.7.23) would follow.
Now take a fixed set of j − 1 linearly independent vectors $\{v_k\}$ and consider
all vectors of the form
$x = \alpha_1 u_1 + \alpha_2 u_2 + \cdots + \alpha_j u_j$ (1.7.24)


where the u's are defined as in Theorem 1.7.6 and the α's are complex numbers.
It is possible to choose these numbers so that $x \in S$. The conditions $(x, v_k) = 0$,
k = 1, 2, ..., j − 1, lead to the system


$\alpha_1 (u_1, v_k) + \alpha_2 (u_2, v_k) + \cdots + \alpha_j (u_j, v_k) = 0,$
where k = 1, 2, ..., j − 1. This is a set of j − 1 linear homogeneous equations in
j unknowns and as such always has a non-trivial solution. Thus, for any S there is a
vector of the form (1.7.24) which is contained in S after normalization. We have
now

$(Hx, x) = \lambda_1 |\alpha_1|^2 + \lambda_2 |\alpha_2|^2 + \cdots + \lambda_j |\alpha_j|^2 \geq \lambda_j$


if, as we may assume, $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_j$, and we use that $\sum_{k=1}^{j} |\alpha_k|^2 = 1$. This proves
the assertion. ∎


Since we are dealing with finite forms (Hx, x), bounded and continuous on
$\|x\| = 1$, it makes sense to speak of maxima and minima. We could, however, just
as well consider suprema and infima in (1.7.23). This formulation of the problem is
easier to carry over to infinite-dimensional situations.
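The construction in the proof of Theorem 1.7.10 can be traced in a small case. The sketch below assumes H = diag(3, 2, 1) on $C^3$, so that the $u_k$ are the standard basis vectors, and takes j = 2; for a given vector v it builds the unit vector of the form (1.7.24) that is orthogonal to v. The matrix, the name `witness` and the random trials are illustrative assumptions.

```python
import math
import random

def inner(x, y):
    """Inner product (x, y) = sum_j x_j * conj(y_j)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

EIGENVALUES = [3.0, 2.0, 1.0]      # H = diag(3, 2, 1); u_k = k-th standard basis vector

def form(x):
    """(Hx, x) for the diagonal H above, i.e. formula (1.7.18)."""
    return sum(lam * abs(c) ** 2 for lam, c in zip(EIGENVALUES, x))

def witness(v):
    """Unit vector alpha_1 u_1 + alpha_2 u_2 orthogonal to v (the case j = 2)."""
    # For standard basis vectors, (u_k, v) = conj(v_k).
    c1, c2 = v[0].conjugate(), v[1].conjugate()
    # Non-trivial solution of alpha_1 * c1 + alpha_2 * c2 = 0.
    alpha = (c2, -c1) if (c1 != 0 or c2 != 0) else (1.0, 0.0)
    norm = math.sqrt(abs(alpha[0]) ** 2 + abs(alpha[1]) ** 2)
    return [alpha[0] / norm, alpha[1] / norm, 0.0]

rng = random.Random(2)             # fixed seed for reproducibility
trials = []
for _ in range(200):
    v = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(3)]
    x = witness(v)
    trials.append((abs(inner(x, v)), form(x)))   # (orthogonality defect, (Hx, x))
```

In every trial the constructed vector lies in S and gives $(Hx, x) \geq \lambda_2 = 2$, so that $\lambda(S) \geq \lambda_2$ whatever v is chosen.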


EXERCISE 1.7


1. Verify (1.7.8).

2. Prove Lemma 1.7.1.

3. Prove that (AB)* = B*A*.

4. Let A be a real symmetric matrix, $A \in \mathfrak{M}_n$, and consider the quadratic form (Ax, x).
Prove the analogues of Theorems 1.7.8 and 1.7.9 by the methods of the Calculus
(e.g. by Lagrange's method of multipliers). Hint: In the case of the absolute
maximum consider

$G(x) = \sum_{j=1}^{n} \sum_{k=1}^{n} a_{jk} x_j x_k - \lambda \sum_{j=1}^{n} x_j^2.$

5. If A is a linear transformation, $A \in \mathfrak{E}(C^n)$, show the existence of a unique decom-
position A = H + iK, where H and K are Hermitian.

6. If H and K are Hermitian and a is a real number, show that H + K and aH are
Hermitian.

7. Show that HK is Hermitian iff H and K commute.

8. If H is Hermitian, show that $H^p$ is Hermitian for any positive integer p.
The preceding three problems cover Theorem 1.7.5.

9. How are the spectra of the Hermitian operators H and $H^2$ related?

10. If the Hermitian operator H is positive definite, show that among the various square
roots of H there is one which is positive definite Hermitian. T is a square root of H
if $T^2(x) = H(x)$, ∀x.

11. If H is Hermitian, show that its Euclidean operator norm $\|H\| = \max_{\|x\| = 1} \|Hx\|$
equals $\max_k |\lambda_k|$.

12. Define $\exp(H) = \sum_{m=0}^{\infty} \frac{1}{m!} H^m$. Show that the series converges in norm,
i.e. the sum of the norms is convergent. Is this an element of $\mathfrak{E}(C^n)$, and if so, why?
Show that exp (H) is Hermitian, if H has this property. Show that the spectrum of
exp (H) is the set $\{\exp(\lambda_k)\}$.

13. Show that if H and K are Hermitian, then HK and KH have the same spectra.
[Hint: Reduce the discussion to the case where one of the factors is regular.]

14. Is exp (H + K) necessarily equal to exp (H) exp (K)? Show that this is true if H and
K commute.

15. Prove that the range of (Hx, x) with H Hermitian for $\|x\| = 1$ is an interval and
determine the latter.

16. Let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be the distinct characteristic values of a Hermitian linear trans-
formation $H \in \mathfrak{E}(C^n)$ and let $P_j$ be the projection corresponding to $\lambda_j$. Prove the
spectral representation

$H = \lambda_1 P_1 + \lambda_2 P_2 + \cdots + \lambda_m P_m.$

17. Find the spectral representation of $H^k$ if H is Hermitian and k is a positive integer.

18. Find the spectral representation of exp (H), H Hermitian.

19. If H is Hermitian, find the spectral representation of its resolvent. [Hint: Remember
there are no nilpotents.]
2 LINEAR VECTOR SPACES


In the preceding chapter it was found that the discussion of complex Euclidean 
spaces leads to a large number of concepts which would seem to have a bearing on 
more general situations. Among such concepts we list in alphabetic order: algebra, 
convexity, distance, functional, inner product, invariant subspace, linear trans- 
formation, matrix, norm, orthonormal system, projection, resolvent, space, 
subspace, vector, and vector space. There are others, but this sample will indicate 
the profusion of ideas that we have to play with. In this chapter we take a first step 
in this direction. We bring a number of definitions centering around the notion of a 
linear vector space. Many theorems will be stated, some without proofs which are 
postponed to later chapters. Examples of linear vector spaces are given in Chapters 
3 and 4. Analysis in linear vector spaces is discussed in Chapters 7 and 8; Banach 
algebras in Chapter 9; linear transformations in Chapter 10; inner product spaces 
in Chapter 11. Omitted proofs will be found in various places in these later 
chapters. 

The present chapter has six sections: Banach spaces; Linear transformations; 
Linear functionals; Linear operator spaces; Inner product spaces; and Banach 
algebras. 


2.1 BANACH SPACES 


In analysis we have often to consider classes of mathematical systems where the 
elements of the class are important as well as the structures of the system. Such a 
structure may be algebraic or topological or possibly both, in which case the algebra 
and the topology must be adjusted to each other. It is customary to refer to such a 
system as an abstract space. We use X as symbol for the space; x, y, ..., for the
elements which are usually called points or vectors. We say that X has an algebraic
structure if one or more algebraic operations may be performed on the elements or if a
notion of order is meaningful.

At this stage we shall not be concerned with general topological spaces. The 
only type of topology that will be considered in the first six chapters of this treatise 
is one based on a notion of distance. This leads to a metric space. 

In the preceding chapter we have encountered a number of abstract spaces
with an algebraic as well as a metric structure. We recall the spaces $C^n$, $\mathfrak{M}_n$, and
$\mathfrak{E}(C^n)$. In all three we had notions of vector addition, scalar multiplication and in
the latter two cases also multiplication of elements. There were also various
notions of distance. On the other hand, so far order between elements has played
no role. The notion of partial ordering will come in Chapter 5.
We now proceed to some general definitions.


Definition 2.1.1. An abstract space X is called a linear vector space if the
operations of addition and scalar multiplication may be performed on the elements
of X subject to the following postulates:

$(A_1)$ Any two elements x and y of X, distinct or not, have a unique sum x + y
which is an element of X.
$(A_2)$ Addition is commutative: x + y = y + x.
$(A_3)$ Addition is associative: x + (y + z) = (x + y) + z.
$(A_4)$ There is a zero element 0 such that x + 0 = 0 + x = x for all x.
$(A_5)$ For every x there is a negative, written −x, such that x + (−x) = 0.

There is a field F of scalars, normally taken to be either the real field R or
the complex field C. Scalar multiplication is subjected to the following
postulates:

$(S_1)$ To every number $\alpha \in F$ and every element $x \in X$ there is a unique scalar
product αx in X.
$(S_2)$ (α + β)x = αx + βx.
$(S_3)$ α(x + y) = αx + αy.
$(S_4)$ α(βx) = (αβ)x.
$(S_5)$ 1·x = x.


A "postulate" is an "assumption" or a "rule of the game" or a "working
hypothesis." It is an undefined term, not subject to proof. The only requirements
that a system of postulates should satisfy are freedom from contradiction and being
satisfied by one or more systems of interest to mathematics.

These postulates involve another undefined term, that of equality, which in the
applications differs from space to space. Whatever the definition is, we assume the
following conditions to be satisfied:

$(E_1)$ Equality is reflexive, x = x.
$(E_2)$ Equality is symmetric, x = y implies y = x.
$(E_3)$ Equality is transitive, x = y and y = z implies x = z.
$(E_4)$ Equality is preserved under addition, x = y implies x + z = y + z for all
z in X.
$(E_5)$ Equality is preserved under scalar multiplication, x = y implies αx = αy
for all $\alpha \in F$.


The postulates have a number of implications which will be stated as lemmas. 
Lemma 2.1.1. There is one and only one zero element.
Proof. If 0* satisfies $(A_4)$, then 0* = 0* + 0 = 0 and 0* = 0. ∎

Lemma 2.1.2. Every x has one and only one negative.


Proof. Suppose that x + y = 0. Then
(−x) + x + y = −x + 0 = −x or 0 + y = −x or y = −x


as asserted. Here use is made of $(A_4)$ and $(E_4)$. ∎

Lemma 2.1.3. The negative of −x is x.


Proof. (−x) + x = 0 for all x. This says that x is the negative of −x and, since
there is a unique negative, the assertion follows. ∎


Lemma 2.1.4. There is a Law of Cancellation:
x + z = y + z implies x = y.
Proof. Add −z to both sides and use $(A_4)$ and $(E_4)$. ∎


Postulates $(S_2)$ and $(S_3)$ state that scalar multiplication and element addition
are distributive operations. In $(S_5)$ the 1 is the unit element of F.


Lemma 2.1.5. We have 0·x = 0 and −x = (−1)x for all x.


Proof. Note that x = 1·x = (1 + 0)·x = x + 0·x so that 0·x is a zero element.
Since the latter is unique, 0·x = 0. Next

0 = 0·x = [1 + (−1)]·x = x + (−1)·x while 0 = x + (−x).

Since the negative is unique, (−1)·x = −x. ∎


We say that X is a real linear vector space if F = R, a complex linear vector
space if F = C.


Let $x_j \in X$, j = 1, 2, ..., n. These vectors are linearly independent over F if
$\alpha_1 x_1 + \alpha_2 x_2 + \cdots + \alpha_n x_n = 0, \quad \alpha_j \in F,$ (2.1.1)
implies $\alpha_1 = \alpha_2 = \cdots = \alpha_n = 0$. They are linearly dependent if there exist $\alpha_j$'s not
all zero such that (2.1.1) holds. The space X is said to be of dimension n, if there are
n linearly independent vectors in X but any system of n + 1 vectors is linearly
dependent. The space is of infinite dimension, if n linearly independent vectors may
be found for any n.
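For vectors with rational coordinates, condition (2.1.1) can be tested by elimination. The following is a minimal sketch, assuming the vectors are given as tuples of integers or fractions; the functions `rank` and `linearly_independent` are illustrative helpers, not notions used in the text.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of the system of vectors, by Gaussian elimination in exact arithmetic."""
    rows = [[Fraction(c) for c in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """True iff (2.1.1) forces all coefficients to vanish."""
    return rank(vectors) == len(vectors)

print(linearly_independent([(1, 2), (3, 4)]))                   # two vectors in the plane
print(linearly_independent([(1, 0, 0), (0, 1, 0), (1, 1, 0)]))  # third is the sum of the first two
```

Any three vectors in a two-dimensional coordinate space come out dependent, in agreement with the definition of dimension.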


We assume the possibility of introducing a metric in our space. 
Definition 2.1.2. An abstract space X is said to be a metric space if for any
ordered pair of points x, y a number d(x, y) is defined, called the distance from x
to y and satisfying the following conditions:


$(D_1)$ d(x, y) ≥ 0 for all x, y and d(x, y) = 0 iff x = y.
$(D_2)$ d(x, y) = d(y, x).
$(D_3)$ d(x, y) ≤ d(x, z) + d(z, y), the triangle inequality.


We recognize the postulates given in Section 1.6. We shall have more to say
about metric spaces in Chapter 5. Here it will be necessary to introduce some ideas
connected with metric spaces before we turn to the special case of a normed
metric.
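The postulates $(D_1)$ to $(D_3)$ can be tested on a finite sample of points. Such a test can refute a proposed distance but cannot prove that it is one. The sketch below is an illustration under that caveat; the function names and the sample points are assumptions.

```python
import itertools
import math

def euclid(x, y):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, y)))

def violated_postulate(d, points, tol=1e-12):
    """Name of the first postulate violated on the sample, or None if none is."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return "D1"
        if abs(d(x, y) - d(y, x)) > tol:
            return "D2"
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return "D3"
    return None

def squared(x, y):
    """Squared difference on the real line; not a distance."""
    return (x[0] - y[0]) ** 2

plane = [(0, 0), (1, 0), (0, 1), (1 + 1j, 2)]
line = [(0,), (1,), (2,)]
print(violated_postulate(euclid, plane))   # no violation found on this sample
print(violated_postulate(squared, line))   # the triangle inequality fails
```

The squared difference fails $(D_3)$ for the points 0, 1, 2, since 4 > 1 + 1.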

In metric spaces many of the concepts of real analysis become meaningful if
Euclidean distance is replaced by the notion of distance assumed to hold in the
space. Suppose that $x_0 \in X$. Then the set


$\{x;\ d(x, x_0) < \varepsilon,\ x \in X\}$ (2.1.2)


is called an ε-neighborhood of $x_0$. It is also known as an open ε-sphere with center at
$x_0$. Let $S \subset X$ be a subset of X. The point $x_0$, belonging to X (but not necessarily
to S), is called a cluster point of S, if every ε-neighborhood of $x_0$ contains two
distinct points of S. The union of S with its set of cluster points is called the closure
of S and is denoted by $\bar{S}$. The set S is closed if $S = \bar{S}$, dense in X if $\bar{S} = X$. The set
S is open in X if $x_0 \in S$ implies the existence of an ε-neighborhood of $x_0$ the points
of which are also in S. The set is nowhere dense in X if every open set in X contains
an open subset, no points of which are in S.
We also need the notion of a sequence


$\{x_k;\ x_k \in X,\ k = 1, 2, \ldots\}.$ (2.1.3)


The reader will find a discussion of sequences in Section 3.1. In real analysis 
sequences play a basic role in the discussion of convergence. Since there is a notion 
of distance in a metric space, we are able to discuss convergence in such spaces. 
We shall see, however, that the situation is not always so favorable in general metric 
spaces as it is in R or for that matter in any of the metric spaces considered in 
Chapter 1. 


Definition 2.1.3. The sequence (2.1.3) is said to be convergent if, for each ε > 0,
there is an integer $N = N_\varepsilon$ such that
$d(x_m, x_n) < \varepsilon$ for all $m, n > N_\varepsilon$. (2.1.4)


The sequence converges to $x_0 \in X$, or $\lim x_m = x_0$, if for each ε > 0
$d(x_m, x_0) < \varepsilon$ for $m > N_\varepsilon$. (2.1.5)


A sequence satisfying (2.1.4) for each ε > 0 is called a Cauchy sequence. The
metric space X is said to be complete if every Cauchy sequence converges to an
element of X.
The spaces $R^n$ and $C^n$ are complete for all n, the space of rational numbers is
not, both in terms of the Euclidean metric. See further Section 5.2.
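The failure of completeness for the rational numbers can be made concrete. The sketch below generates rational approximations to √2 by Newton's iteration, a choice made here for illustration. The terms crowd together, yet no rational number has square 2, so the sequence has no limit among the rationals. Exact rational arithmetic is used throughout.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates x -> (x + 2/x)/2, starting from 1."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq

seq = newton_sqrt2(6)
# Distances between consecutive terms.
gaps = [abs(b - a) for a, b in zip(seq, seq[1:])]
# How far the square of each term is from 2.
defects = [abs(x * x - 2) for x in seq]
for x, g in zip(seq[1:], gaps):
    print(x, float(g))
```

The printed gaps shrink rapidly, which illustrates (without proving) the Cauchy property (2.1.4), while every `defect` stays different from zero.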

We now restrict ourselves to linear spaces and a metric based on a norm. The 
latter concept has figured repeatedly in Chapter 1 but the definition is repeated for 
the sake of completeness. 


Definition 2.1.4. A linear vector space X over C (over R) is said to be normed if
for each $x \in X$ there is defined a real number ‖x‖ called the norm of x such that


$(N_1)$ ‖x‖ ≥ 0 and ‖x‖ = 0 iff x = 0.
$(N_2)$ ‖αx‖ = |α| ‖x‖ for all α in C (or R).
$(N_3)$ ‖x + y‖ ≤ ‖x‖ + ‖y‖.


The distance in the linear normed space X is defined by
d(x, y) = ‖x − y‖. (2.1.6)


It is left to the reader to verify that this definition of distance satisfies 
Definition 2.1.2. 
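As an illustration of Definition 2.1.4 and (2.1.6), the sketch below implements three familiar norms on $C^n$ and the distance each one induces. The choice of these three norms and the function names are assumptions made for the example; they are not taken from the definitions of Chapter 1.

```python
import math

def norm_1(x):
    """Sum of the absolute values of the coordinates."""
    return sum(abs(c) for c in x)

def norm_2(x):
    """Euclidean norm."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def norm_inf(x):
    """Largest absolute value of a coordinate."""
    return max(abs(c) for c in x)

def distance(norm, x, y):
    """Distance induced by a norm, d(x, y) = ||x - y||, as in (2.1.6)."""
    return norm([a - b for a, b in zip(x, y)])

x = [3 + 4j, 0j]
y = [0j, 1 + 0j]
alpha = 2j
for norm in (norm_1, norm_2, norm_inf):
    total = norm([a + b for a, b in zip(x, y)])
    scaled = norm([alpha * c for c in x])
    # (N3) and (N2) on this sample: total <= ||x|| + ||y||, scaled = |alpha| ||x||.
    print(norm.__name__, total, norm(x) + norm(y), scaled, distance(norm, x, y))
```

Checking the postulates on one pair of vectors is only a sanity test; the general verification is the content of Problem 5 below.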

In Chapter 1 the spaces $C^n$, $R^n$, $\mathfrak{M}_n$, $\mathfrak{E}(C^n)$ were considered and shown to be
normed linear vector spaces, complete in their normed metrics.


Definition 2.1.5. A normed linear vector space which is complete under the 
normed metric is called a Banach space, B-space for short. 


The name is in honor of the Polish mathematician Stefan Banach (1892-1945),
one of the founders of modern functional analysis. He introduced this range of
ideas in his dissertation of 1922. Among his forerunners should be mentioned
the Frenchman Maurice Fréchet (1878-1973) and the Hungarian Frigyes Riesz
(1880-1956).

We give one more definition. 


Definition 2.1.6. A metric space is said to be separable if it contains a countably 
infinite set of points dense in the space. 


EXERCISE 2.1


1. Prove that $C^n$ is complete in the Euclidean metric.

2. How should the argument be modified for the $l_1$-metric?

3. Show that $C^n$ is separable in either metric.

4. Why is the space of rational numbers not complete under the Euclidean distance?

5. Verify that (2.1.6) defines a distance.

6. Is $\mathfrak{M}_n$ complete in each of the metrics defined in Section 1.4? Is it separable?

7. On the real line R, can the expression

$\frac{|x - y|}{1 + |x - y|}$

be used as a distance between x and y? If so, what is the upper bound of the
distances? Can the expression be used as a norm?

8. Prove that a convergent sequence in a B-space is necessarily bounded. Is a bounded
sequence always convergent?

9. In a B-space addition and scalar multiplication are continuous operations. Explain
in some detail what the assertions mean and give a proof.

10. Prove that in a B-space the norm of x is a continuous function of x so that
$\lim_{m \to \infty} x_m = x_0$ implies $\lim_{m \to \infty} \|x_m\| = \|x_0\|$.

11. Prove that the Bolzano-Weierstrass theorem holds in $\mathfrak{M}_n$. In other words, a bounded
infinite set of matrices in $\mathfrak{M}_n$ admits of at least one cluster point.


2.2 LINEAR TRANSFORMATIONS


Consider two abstract spaces X and Y where Y may conceivably be a second copy
of X. The product space X × Y (or Cartesian product space) is by definition the
set of all ordered pairs (x, y) with $x \in X$, $y \in Y$.


Definition 2.2.1. A mapping T from X into Y is a collection of ordered pairs
(x, y), $x \in X$, $y \in Y$, such that every x of X belongs to one and only one pair
(x, y). Here y = T(x) is called the image of x induced by T and x is the pre-
image of y.


Note that y may be the image of several points x and it is not excluded that all
of X may be mapped on a single point y. Such instances were encountered in
Section 1.4.

So far the spaces are arbitrary. We now specialize: X and Y shall be linear
vector spaces and T a linear mapping.


Definition 2.2.2. Let X and Y be linear vector spaces over the same scalar field F.
T is said to be a linear mapping (equivalently, a linear transformation or a linear
operator) from X into Y if for all $x_1, x_2 \in X$ and all $\alpha, \beta \in F$


$T(\alpha x_1 + \beta x_2) = \alpha T(x_1) + \beta T(x_2).$ (2.2.1)


Note that here we are dealing with equality in Y, which may be different from the
notion of equality in X. The same observation applies to the algebraic
operations.


If T is a linear transformation, then


T(0) = 0. (2.2.2)

Here on the left we have the zero element of X and the assertion is that its image is
the zero element of Y. It is not customary to distinguish between zero elements in
different spaces by differences in notation. The context shows which zero element is
meant. To prove (2.2.2) note that


T(0) = T(0 + 0) = T(0) + T(0),
which implies (2.2.2).
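Condition (2.2.1) can be tested on sample vectors and scalars. The sketch below compares a map given by a matrix with a translation; both maps, the samples and the helper names are illustrative assumptions. A finite test of this kind can refute linearity but not establish it.

```python
def matrix_map(A):
    """The mapping x -> Ax on tuples, for A given as a list of rows."""
    return lambda x: tuple(sum(a * xk for a, xk in zip(row, x)) for row in A)

def is_linear_on(T, samples, scalars, tol=1e-12):
    """Check (2.2.1) for all sample vectors and all sample scalars."""
    for x1 in samples:
        for x2 in samples:
            for a in scalars:
                for b in scalars:
                    lhs = T(tuple(a * u + b * v for u, v in zip(x1, x2)))
                    rhs = tuple(a * p + b * q for p, q in zip(T(x1), T(x2)))
                    if any(abs(l - r) > tol for l, r in zip(lhs, rhs)):
                        return False
    return True

T = matrix_map([[1, 2], [0, 3]])

def shift(x):
    """Translation by (1, 0); it does not satisfy (2.2.1)."""
    return (x[0] + 1, x[1])

samples = [(1, 0), (0, 1), (2, -3)]
scalars = [2, -1, 0.5]
print(is_linear_on(T, samples, scalars), T((0, 0)))
print(is_linear_on(shift, samples, scalars), shift((0, 0)))
```

The translation already fails the necessary condition (2.2.2), since it does not map the zero element on the zero element.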
In general, the zero element of X is not the only element of X mapped into the
zero element of Y. The set of all such vectors is the null space or kernel of T. Thus


N[T] = {x; T(x) = 0}. (2.2.3)


Just as in $C^n$, there is a profound difference between the two cases N[T] = {0}
and N[T] ≠ {0}. This will be taken up below at least for B-spaces.
We now specialize still further: X and Y shall be B-spaces.


Definition 2.2.3. If X and Y are Banach spaces and T is a linear transformation
from X to Y, then T is said to be bounded if there exists a constant M such that
for all x

$\|T(x)\| \leq M \|x\|.$ (2.2.4)


Here again we are dealing with two different versions of the same concept: on
the left is the norm as defined in Y, on the right as defined in X. If confusion is likely,
we mark the norms by affixing suitable subscripts.

If there is an M satisfying (2.2.4), then any larger M will also do. The greatest
lower bound of the set of admissible M's is called the norm of T, written ‖T‖, and
it may be defined directly as follows:


$\|T\| = \sup_{\|x\| = 1} \|T(x)\|.$ (2.2.5)


Note that to define the norm it is enough to consider the unit sphere of X, for any
x, x ≠ 0, not on the sphere, may be written $x = r x_1$ where $\|x_1\| = 1$ and $r = \|x\|$.
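Formula (2.2.5) suggests a crude numerical estimate: sample the unit sphere and take the largest value of ‖T(x)‖. The sketch below does this for a real 2 by 2 matrix acting on the plane with the Euclidean norm; the matrix and the sampling by equally spaced angles are assumptions made for the example. In general such sampling gives only a lower estimate of ‖T‖.

```python
import math

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def norm(x):
    """Euclidean norm of a real vector."""
    return math.sqrt(sum(c * c for c in x))

def operator_norm_estimate(A, samples=360):
    """Largest ||Ax|| over equally spaced points of the unit circle."""
    best = 0.0
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        x = (math.cos(theta), math.sin(theta))
        best = max(best, norm(apply(A, x)))
    return best

A = [[3.0, 0.0], [0.0, 1.0]]
estimate = operator_norm_estimate(A)

# The bound (2.2.4) with M = estimate, tested on a vector off the unit circle.
x = (0.3, -4.0)
print(estimate, norm(apply(A, x)), estimate * norm(x))
```

For this diagonal matrix the maximum is reached at the sample point (1, 0), so the estimate coincides with the norm.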

Actually ‖T‖ is the norm of T in still another space, that of all linear bounded
transformations from X to Y, denoted by $\mathfrak{E}(X, Y)$. If Y = X, we write simply $\mathfrak{E}(X)$.
Here "$\mathfrak{E}$" stands for endomorphism. For these operator spaces see further Sections
2.4 and 10.1. There it will be shown that $\mathfrak{E}(X, Y)$ and $\mathfrak{E}(X)$ are linear vector spaces
for a suitable definition of addition and scalar multiplication. Moreover, (2.2.5)
defines a norm in $\mathfrak{E}(X, Y)$ and a normed topology in terms of which the space is
complete. This applies, in particular, to the space $\mathfrak{E}(X)$, but here we have also a
definition of element multiplication and $\mathfrak{E}(X)$ is a normed algebra complete in its
metric, a so-called Banach algebra.

The reader will recall the space $\mathfrak{E}(C^n)$ defined in Section 1.4 as the set of all
linear transformations from $C^n$ to itself. In this case "boundedness" is a con-
sequence of "linearity", but in general for infinite dimensional B-spaces X this is not
the case. All the properties stated for $\mathfrak{E}(X)$ are obviously true for $\mathfrak{E}(C^n)$, so we know
in advance that the proposed discussion is neither vacuous nor lacking in interest.
In the rest of this section we make some observations on individual trans-
formations T in $\mathfrak{E}(X, Y)$. Our first remark is attached to (2.2.3).

Theorem 2.2.1. If N[T] = {0}, then the linear transformation T is 1-1, i.e.

$x_1 \neq x_2$ implies $T(x_1) \neq T(x_2)$. (2.2.6)


Proof. If $x_1 \neq x_2$ and $T(x_1) = T(x_2)$, then
$T(x_1 - x_2) = 0,$
so that $x_1 - x_2 \in N[T]$ against the assumption. ∎


Let R[T] be the range of T, i.e.
$R[T] = \{y;\ y \in Y,\ T(x) = y \text{ for some } x \in X\}.$ (2.2.7)


If N[T] = {0}, then each $y \in R[T]$ is the image of one and only one $x \in X$.
This says that the set of ordered pairs


$\{(y, x);\ y \in R[T],\ y = T(x),\ x \in X\}$ (2.2.8)


is also a mapping, in this case, of R[T] onto X and this mapping is also 1-1. We
recall that "onto X" means that every $x \in X$ is second coordinate of one of the pairs
(y, x), in fact of one and only one pair since the mapping is 1-1. We denote this
second mapping by $T^{-1}$ and call it the inverse of T. We have


$T^{-1}[T(x)] = x, \quad \forall x \in X,$ (2.2.9)
$T[T^{-1}(y)] = y, \quad \forall y \in R[T].$ (2.2.10)


Theorem 2.2.2. Let $T \in \mathfrak{E}(X, Y)$ define a 1-1 mapping. Then $T^{-1}$ defines a
linear mapping of R[T] onto X. Further, in order that $T^{-1}$ belong to $\mathfrak{E}(Y, X)$
it is necessary and sufficient that (1) R[T] = Y and (2) there exists a positive
number m such that
$\|T(x)\| \geq m \|x\|, \quad \forall x.$ (2.2.11)


Proof. That $T^{-1}$ is linear on R[T] is shown as follows. If
$y_1 = T(x_1)$, $y_2 = T(x_2)$, then $x_1 = T^{-1}(y_1)$, $x_2 = T^{-1}(y_2)$
and the set R[T] is linear by definition so that


$T(\alpha x_1 + \beta x_2) = \alpha y_1 + \beta y_2$
implies that
$T^{-1}(\alpha y_1 + \beta y_2) = \alpha x_1 + \beta x_2 = \alpha T^{-1}(y_1) + \beta T^{-1}(y_2)$ (2.2.12)
as asserted.
The assumption that $T^{-1} \in \mathfrak{E}(Y, X)$ implies that $T^{-1}$ is defined on all of Y,
i.e. R[T] = Y, and $T^{-1}$ is linear and bounded on Y. Now the boundedness implies
the existence of a constant M > 0 such that for all y = T(x)


$\|x\| = \|T^{-1}(y)\| \leq M \|y\| = M \|T(x)\|.$ (2.2.13)


This implies (2.2.11) with m = 1/M. Thus the conditions are necessary.
In order to prove the sufficiency we note first that (2.2.11) implies that T is 1-1.
For


$\|T(x_1 - x_2)\| \geq m \|x_1 - x_2\|$


shows that the left member can be zero iff the right member is zero. Thus $T^{-1}$
exists and its domain of definition is R[T] = Y. Then from (2.2.10) and (2.2.12)
for all y


$\|y\| = \|T[T^{-1}(y)]\| \geq m \|T^{-1}(y)\|$
or
$\|T^{-1}(y)\| \leq (1/m) \|y\|,$
so that $T^{-1}$ is bounded. This shows that $T^{-1} \in \mathfrak{E}(Y, X)$. ∎


It should be observed that even if T is 1-1, its range need not be all of Y nor
is $T^{-1}$ necessarily a bounded transformation on R[T]. If N[T] ≠ {0}, then we
lose the existence of $T^{-1}$ altogether and T is no longer 1-1. We saw in Chapter 1
how this could happen for mappings in $\mathfrak{E}(C^n)$. For the first type of pathology we
have to go to linear vector spaces of infinite dimension. Examples will be given
later.
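The relation between the constant m of (2.2.11) and the norm of the inverse can be seen in a finite-dimensional sketch. The matrices below, the sampling of the unit circle and the helper names are assumptions made for this illustration; the sampled minimum is only an estimate of the best constant m.

```python
import math

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in A]

def norm(x):
    """Euclidean norm of a real vector."""
    return math.sqrt(sum(c * c for c in x))

def unit_circle(samples=360):
    """Equally spaced points of the unit circle in the plane."""
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        yield (math.cos(theta), math.sin(theta))

def lower_bound(A):
    """Sampled estimate of m = min ||Ax|| over ||x|| = 1."""
    return min(norm(apply(A, x)) for x in unit_circle())

def upper_bound(A):
    """Sampled estimate of ||A|| = max ||Ax|| over ||x|| = 1."""
    return max(norm(apply(A, x)) for x in unit_circle())

T = [[2.0, 0.0], [0.0, 0.5]]
T_inv = [[0.5, 0.0], [0.0, 2.0]]      # inverse of T
singular = [[1.0, 0.0], [0.0, 0.0]]   # null space is not {0}

m = lower_bound(T)
inv_norm = upper_bound(T_inv)
print(m, inv_norm, lower_bound(singular))
```

Here the norm of the inverse equals 1/m, in line with the proof of Theorem 2.2.2, while for the singular matrix no positive m exists.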


EXERCISE 2.2


1. The set $\mathfrak{M}_n$ of all n by n matrices is a B-space (in fact, a B-algebra) under the definitions
of algebraic operations and (alternative) norms given in Section 1.4. What is the
general form of a transformation in $\mathfrak{E}(\mathfrak{M}_n)$?

2. Find necessary and sufficient conditions for such a transformation to be 1-1.

3. C[0, 1] denotes the space of all functions t → f(t) defined and continuous for
0 ≤ t ≤ 1. It will be shown in Section 3.2 that this is a B-algebra under the obvious
definition of the algebraic operations and with $\|f\| = \sup_{0 \leq t \leq 1} |f(t)|$. Define

$T[f](t) = t(1 + t^2)^{-1} f(t).$

Find ‖T‖.

4. The transformation T of the preceding problem is obviously linear. Is it 1-1?
Find N[T] and R[T].

5. With the same setting, is $T^{-1}$ bounded on R[T]?

6. With the same space define U[f](t) = 1 + t f(t). Is this a linear transformation?
Could you state some sense in which U could be said to be bounded?

7. Prove that a linear transformation from a B-space X to a B-space Y is continuous iff
it is bounded.


2.3 LINEAR FUNCTIONALS 


In this section we shall be concerned with the space $\mathfrak{E}(\mathfrak{X}, C)$, assuming $\mathfrak{X}$ to be a
complex B-space. This is the space of all linear bounded functionals defined on $\mathfrak{X}$.
It is referred to as the adjoint space of $\mathfrak{X}$ and commonly denoted by $\mathfrak{X}^*$. For its
elements symbols like $x^*$ are used.

We note that $\mathfrak{X}^*$ becomes a linear vector space if addition and scalar
multiplication are defined by

$$(x_1^* + x_2^*)(x) = x_1^*(x) + x_2^*(x), \quad \forall x, \tag{2.3.1}$$
$$(\alpha x^*)(x) = \alpha x^*(x), \quad \forall \alpha. \tag{2.3.2}$$

Since the norm in $C$ is the absolute value we define

$$\|x^*\| = \sup_{\|x\| = 1} |x^*(x)|. \tag{2.3.3}$$

In this manner $\mathfrak{X}^*$ becomes a normed linear vector space.
Theorem 2.3.1. $\mathfrak{X}^*$ is complete in the normed metric.


Proof. Suppose that $\{x_n^*\}$ is a Cauchy sequence in $\mathfrak{X}^*$. Then, for each $\varepsilon > 0$,
there is an $N = N_\varepsilon$ such that

$$\|x_m^* - x_n^*\| < \varepsilon \quad \text{for } m, n > N_\varepsilon. \tag{2.3.4}$$

Moreover, by Problem 7, Exercise 2.1, there is a finite $M$ with

$$\|x_k^*\| \le M, \quad \forall k. \tag{2.3.5}$$

Then for each $x \in \mathfrak{X}$

$$|x_m^*(x) - x_n^*(x)| = |(x_m^* - x_n^*)(x)| \le \|x_m^* - x_n^*\| \, \|x\| < \varepsilon \|x\|. \tag{2.3.6}$$

Now for each fixed $x$ the sequence $\{x_n^*(x)\}$ is a sequence of complex numbers and
(2.3.6) shows that this is a Cauchy sequence. Since $C$ is complete

$$\lim_{n \to \infty} x_n^*(x) = f(x) \tag{2.3.7}$$

exists as a linear complex valued function on $\mathfrak{X}$. Moreover,

$$|f(x)| \le \sup_n \|x_n^*\| \, \|x\| \le M\|x\|, \tag{2.3.8}$$

so that $f \in \mathfrak{X}^*$. By (2.3.6)

$$\lim_{m \to \infty} |x_m^*(x) - x_n^*(x)| = |f(x) - x_n^*(x)| \le \varepsilon \|x\|.$$

Now for $\|x\| = 1$

$$|f(x) - x_n^*(x)| \le \varepsilon$$

uniformly in $x$. Since

$$\sup_{\|x\| = 1} |f(x) - x_n^*(x)| = \|f - x_n^*\|$$

by (2.3.3), this shows that

$$\|f - x_n^*\| \le \varepsilon, \quad n > N_\varepsilon,$$

and

$$\lim_{n \to \infty} \|f - x_n^*\| = 0. \tag{2.3.9}$$

Thus every Cauchy sequence converges to a limit in the space and $\mathfrak{X}^*$ is complete
and hence a B-space. ∎


In Problem 4, Exercise 1.2, the reader showed that a linear functional on the
space $C^n$ is necessarily an inner product. Thus to each $x^* \in \mathfrak{E}(C^n, C)$ there is a
$y \in C^n$ such that

$$x^*(x) = (x, y), \quad \forall x, \quad \text{and} \quad \|x^*\| = \|y\|. \tag{2.3.10}$$


Since inner products are not defined in general B-spaces, we cannot expect this 
result to hold except in inner product spaces. Nevertheless, it is instructive to 
examine some special cases involving functionals with particular properties which, 
as we shall see later, exist for all B-spaces even if the construction that is used here 
breaks down. 

We note first that $(x, y) = 0$ for all $x$ iff $y = 0$, i.e. the zero functional is the only
element of $\mathfrak{X}^*$ which annihilates all of $\mathfrak{X}$. Next set

$$y = \frac{x_0}{\|x_0\|}, \quad x_0 \text{ fixed} \ne 0, \quad x^*(x) = (x, y). \tag{2.3.11}$$

Here

$$(x_0, y) = \|x_0\| \quad \text{and} \quad \|x^*\| = 1.$$

Suppose that $\mathfrak{X}_0$ is a linear proper subspace of $C^n$ of dimension $q < n$. It is
desired to find a linear functional on $C^n$ whose null space contains $\mathfrak{X}_0$ and which is
of norm 1. We can find an orthonormal basis $u_1, u_2, \ldots, u_q, u_{q+1}, \ldots, u_n$ such that
the first $q$ vectors form a basis of $\mathfrak{X}_0$. Now set

$$y = \alpha_1 u_{q+1} + \alpha_2 u_{q+2} + \cdots + \alpha_{n-q} u_n, \quad \sum_{j=1}^{n-q} |\alpha_j|^2 = 1, \quad x^*(x) = (x, y). \tag{2.3.12}$$

Then

$$x^*(x) = 0, \quad \forall x \in \mathfrak{X}_0, \quad \text{and} \quad \|x^*\| = 1.$$

Thus the null space of $x^*$ contains $\mathfrak{X}_0$ but may conceivably be a larger linear
subspace. It cannot be all of $\mathfrak{X}$, however, since, for example, $x^*(\alpha y) = \alpha \ne 0$ unless
$\alpha = 0$.

As the last problem, we show how to construct a linear functional on $C^n$ which
distinguishes between two given elements $x_1$ and $x_2$ where $x_1 \ne x_2$. It is desired
to find $x^* \in \mathfrak{E}(C^n, C)$ such that $x^*(x_1) \ne x^*(x_2)$. We may assume $x_1$ and $x_2$
different from zero since the case where one of them is zero is trivial. If $x_1$ and $x_2$
are orthogonal to each other, then the functional $(x, x_2)$ will give a solution since
$(x_1, x_2) = 0$ while $(x_2, x_2) = \|x_2\|^2 > 0$. If $(x_1, x_2) \ne 0$ and $x_1 \ne \gamma x_2$, then there
exists a vector $z$ orthogonal to $x_1$ but not to $x_2$. Hence $(x, z) \ne 0$ for $x = x_2$ but
is equal to zero for $x = x_1$. Thus $(x, z)$ solves the problem in this case. Finally, if
$x_1 = \gamma x_2$ we have $\gamma \ne 1$ by assumption and $(x, x_2)$ has the desired properties.
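The case analysis above can be bypassed numerically by a shortcut that is not the construction of the text: the functional $x^*(x) = (x, x_1 - x_2)$ always separates $x_1$ from $x_2$, since $x^*(x_1) - x^*(x_2) = \|x_1 - x_2\|^2 > 0$. A minimal Python sketch in $C^3$ follows; the names `inner` and `separating_vector` and the sample vectors are illustrative assumptions.

```python
# Sketch: a functional x*(x) = (x, y) on C^n with x*(x1) != x*(x2).
# Shortcut y = x1 - x2 (an assumption of this sketch, not the text's case analysis).

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def separating_vector(x1, x2):
    """Return y = x1 - x2, so that (x1, y) - (x2, y) = ||x1 - x2||^2."""
    d = [a - b for a, b in zip(x1, x2)]
    if inner(d, d) == 0:
        raise ValueError("x1 and x2 must be distinct")
    return d

x1 = [1 + 0j, 2j, 0j]
x2 = [1 + 0j, 0j, 1 + 0j]
y = separating_vector(x1, x2)
gap = inner(x1, y) - inner(x2, y)   # equals ||x1 - x2||^2
```

Here `gap` is the squared distance between the two sample vectors, so it is non-zero exactly when they differ.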

We have carried out this discussion in C” in some detail because functionals 
with such properties exist in all B-spaces. This will be proved in Chapter 10, but we 
state the result here for future reference. 


Theorem 2.3.2. Every B-space has infinitely many linear bounded functionals.

In particular, there exist functionals with one of the following properties:

1) If $x_0 \in \mathfrak{X}$, $x_0 \ne 0$, then there exists an $x_0^* \in \mathfrak{X}^*$ such that $x_0^*(x_0) = \|x_0\|$
and $\|x_0^*\| = 1$.

2) If $\mathfrak{X}_0$ is a linear subspace of $\mathfrak{X}$, not dense in $\mathfrak{X}$, then there is an $x_1^* \in \mathfrak{X}^*$
such that $x_1^*(x) = 0$, $\forall x \in \mathfrak{X}_0$, and $\|x_1^*\| = 1$.

3) If $x_1$ and $x_2 \in \mathfrak{X}$, $x_1 \ne x_2$, then there exists an $x_2^* \in \mathfrak{X}^*$ with
$x_2^*(x_1) \ne x_2^*(x_2)$.


From (3) we conclude that the only element of $\mathfrak{X}$ that annihilates all functionals
is the zero element, while from (1) it follows that the only functional which annihilates
all of $\mathfrak{X}$ is the zero functional. B-spaces have a profusion of linear bounded
functionals. Not all linear vector spaces are so richly endowed. In fact, there are 
some where the zero functional is the only linear bounded functional. 


EXERCISE 2.3 


1. Determine the linear functionals on the space $\mathfrak{M}_n$.
2. Discuss the validity of Theorem 2.3.2 in this space.


3. Let $C[0, 1]$ be defined as in Exercise 2.2, i.e. as the linear vector space formed by
functions $t \to f(t)$ defined and continuous on the closed interval $[0, 1]$ and normed
by the sup-norm. Let $t_0$ be fixed, $0 \le t_0 \le 1$, and define $L[f] = f(t_0)$ for all
$f \in C[0, 1]$. Show that this is a linear bounded functional on the space. Find the
norm of $L$.


4. Let $\mathfrak{N}[L]$ denote the null space of this functional. Show that it is a convex subset of
$\mathfrak{X} = C[0, 1]$.

5. Let $f$, $g$, $h$ be elements of $\mathfrak{X}$ where $f$ and $g \in \mathfrak{N}[L]$ and $h$ is arbitrary. Show that $f + g$
and $fh \in \mathfrak{N}[L]$. In algebraic parlance: $\mathfrak{N}[L]$ is an ideal of the B-algebra $C[0, 1]$.


6. Show that $\int_0^1 f(t)\,dt = L_1[f]$ is also a linear bounded functional on $C[0, 1]$. Find
the norm of $L_1$.
7. Characterize $\mathfrak{N}[L_1]$. Is the null space an ideal in the sense of Problem 5 above?


8. Show that $L[f]$ is multiplicative as well as linear, i.e. $L[fg] = L[f]\,L[g]$, and verify
that $L_1[f]$ lacks this property.


9. Construct two functionals which distinguish between two given elements $f$ and $g$ of
$C[0, 1]$.


2.4 LINEAR OPERATOR SPACES 


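Let $\mathfrak{X}$ and $\mathfrak{Y}$ be two B-spaces over the complex field. The spaces $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$ and $\mathfrak{E}(\mathfrak{X})$
were defined in Section 2.2. A closer examination is now in order.

Here $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$ is the set of all linear bounded transformations $T$ from $\mathfrak{X}$ to $\mathfrak{Y}$.
We have also defined the norm of $T$ by

$$\|T\| = \sup_{\|x\| = 1} \|T(x)\|. \tag{2.4.1}$$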


Our first order of business is to show how the algebraic operations can be defined 
in €(X, Y) so that it becomes a linear vector space. To this end we define sums and 
scalar multiples by 


$$(T_1 + T_2)(x) = T_1(x) + T_2(x), \tag{2.4.2}$$
$$(\alpha T)(x) = \alpha T(x). \tag{2.4.3}$$

Here $\alpha$ is arbitrary in $C$ while $T$, $T_1$, $T_2$ are arbitrary in $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$. These conventions
are natural and involve no difficulties.
Next we observe that 


$$\|T_1 + T_2\| = \sup_{\|x\| = 1} \|T_1(x) + T_2(x)\| \le \sup_{\|x\| = 1} \|T_1(x)\| + \sup_{\|x\| = 1} \|T_2(x)\| = \|T_1\| + \|T_2\| \tag{2.4.4}$$

and $\|\alpha T\| = |\alpha| \, \|T\|$. Thus the norm already introduced satisfies conditions $N_1$ to
$N_3$. Further, we define

$$d(T_1, T_2) = \|T_1 - T_2\|. \tag{2.4.5}$$

We have now to show that $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$ is actually a B-space under the normed
metric. This has already been shown for the special case $\mathfrak{Y} = C$ and the argument
given in Theorem 2.3.1 may be followed step by step.


Theorem 2.4.1. The space $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$ is complete under the normed metric.
Proof. Suppose that $\{T_n\}$ is a Cauchy sequence in $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$. Then for each $\varepsilon > 0$
there is an $N = N_\varepsilon$ such that

$$\|T_m - T_n\| < \varepsilon, \quad m, n > N_\varepsilon, \tag{2.4.6}$$

and for suitable $M$

$$\|T_k\| \le M, \quad \forall k. \tag{2.4.7}$$

For each $x \in \mathfrak{X}$

$$\|T_m(x) - T_n(x)\| = \|(T_m - T_n)(x)\| \le \|T_m - T_n\| \, \|x\| < \varepsilon \|x\|. \tag{2.4.8}$$

This shows that $\{T_n(x)\}$ is a Cauchy sequence in the complete space $\mathfrak{Y}$ for each $x$.
Hence

$$\lim_{k \to \infty} T_k(x) = T(x) \tag{2.4.9}$$

exists as an element of $\mathfrak{Y}$. Moreover,

$$\|T(x)\| \le \sup_k \|T_k\| \, \|x\| \le M\|x\|. \tag{2.4.10}$$

Thus (2.4.9) defines a mapping from $\mathfrak{X}$ into $\mathfrak{Y}$ which is obviously linear and
(2.4.10) shows that it is also bounded. Hence $T \in \mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$. We have further

$$\lim_{m \to \infty} \|T_m(x) - T_n(x)\| = \|T(x) - T_n(x)\| \le \varepsilon \|x\|.$$

Now for $\|x\| = 1$

$$\|T(x) - T_n(x)\| \le \varepsilon$$

uniformly in $x$. Since

$$\sup_{\|x\| = 1} \|T(x) - T_n(x)\| = \|T - T_n\|,$$

we see that

$$\|T - T_n\| \le \varepsilon, \quad n > N_\varepsilon,$$

and

$$\lim_{n \to \infty} \|T - T_n\| = 0. \tag{2.4.11}$$

Thus every Cauchy sequence in $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$ converges to a limit in the space, which is
hence complete in the normed metric and a B-space. ∎


If $\mathfrak{Y} = \mathfrak{X}$ we are dealing with the space $\mathfrak{E}(\mathfrak{X})$. Here products of elements may
also be defined. For if $T_1, T_2 \in \mathfrak{E}(\mathfrak{X})$, then $T_2(x)$ is defined and is an element of $\mathfrak{X}$
for each $x \in \mathfrak{X}$. But then $T_1[T_2(x)]$ is also defined and is in $\mathfrak{X}$. Thus $T_1[T_2(x)]$
defines a linear mapping from $\mathfrak{X}$ into itself. Moreover, it is bounded for

$$\sup_{\|x\| = 1} \|T_1[T_2(x)]\| \le \|T_1\| \sup_{\|x\| = 1} \|T_2(x)\| = \|T_1\| \, \|T_2\|. \tag{2.4.12}$$

We write

$$T_1[T_2(x)] = [T_1 T_2](x) \tag{2.4.13}$$

and define this transformation as the product of the operators $T_1$ and $T_2$ in this order.
Note that we operate first with $T_2$ on $x$ and then apply $T_1$ to the result. This type of
multiplication is normally non-commutative, i.e.

$$T_1 T_2 \ne T_2 T_1. \tag{2.4.14}$$

From (2.4.12) we get

$$\|T_1 T_2\| \le \|T_1\| \, \|T_2\|. \tag{2.4.15}$$

Multiplication is associative

$$(T_1 T_2) T_3 = T_1 (T_2 T_3) \tag{2.4.16}$$

and distributive with respect to addition

$$(T_1 + T_2) T_3 = T_1 T_3 + T_2 T_3, \tag{2.4.17}$$
$$T_1 (T_2 + T_3) = T_1 T_2 + T_1 T_3. \tag{2.4.18}$$
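For a concrete check of the order of the factors, of non-commutativity, and of the inequality for the norm of a product, one may take 2 by 2 matrices acting on $C^2$. The following Python sketch assumes that $C^2$ carries the sup-norm, so that the operator norm is the largest absolute row sum; this choice of norm, the helper names, and the sample matrices are assumptions made for the illustration only.

```python
# Sketch: operator products in E(C^2), with matrices standing in for operators.

def matmul(A, B):
    """Matrix of the product AB: apply B first, then A."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(A, x):
    """Image of the vector x under the matrix A."""
    return [sum(a * c for a, c in zip(row, x)) for row in A]

def row_sum_norm(A):
    """Operator norm induced by the sup-norm on C^n (largest absolute row sum)."""
    return max(sum(abs(a) for a in row) for row in A)

T1 = [[0, 1], [0, 0]]
T2 = [[1, 0], [0, 2]]
x = [3, 5]

p12 = matmul(T1, T2)          # T1 T2: first T2, then T1
p21 = matmul(T2, T1)          # T2 T1: first T1, then T2
same_as_composition = apply(p12, x) == apply(T1, apply(T2, x))
commute = p12 == p21
norm_bound_holds = row_sum_norm(p12) <= row_sum_norm(T1) * row_sum_norm(T2)
```

With these matrices the two products differ, while the bound on the norm of the product holds with equality.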


We say that $\mathfrak{E}(\mathfrak{X})$ is a Banach algebra in the sense to be defined in Section 2.6.
$\mathfrak{E}(\mathfrak{X})$ has a unit element, the identity operator $I$ with the properties

$$I(x) = x, \quad \forall x, \tag{2.4.19}$$
$$IT = TI = T, \quad \forall T. \tag{2.4.20}$$

In an algebra some elements have inverses. We say that $S \in \mathfrak{E}(\mathfrak{X})$ is the inverse of
$T \in \mathfrak{E}(\mathfrak{X})$ if

$$(ST)(x) = (TS)(x) = x, \quad \forall x. \tag{2.4.21}$$

Here $S$ is $T^{-1}$ provided this transformation exists and is in $\mathfrak{E}(\mathfrak{X})$. For the existence
of $T^{-1}$ see Theorem 2.2.2. The inverse is unique if it exists.
Let $\lambda$ be a complex number and consider the operator

$$T_\lambda = \lambda I - T, \tag{2.4.22}$$

where $T \in \mathfrak{E}(\mathfrak{X})$ is fixed. The values of $\lambda$ fall into two mutually exclusive classes
according as $T_\lambda^{-1}$ exists as an element of $\mathfrak{E}(\mathfrak{X})$ or not. In the first case we say that $\lambda$
belongs to the resolvent set $\rho(T)$ of $T$, in the second to the spectrum $\sigma(T)$. In the first
case we write

$$T_\lambda^{-1} = R(\lambda, T) \tag{2.4.23}$$

and call it the resolvent of $T$. The set $\sigma(T)$ is bounded and closed, the set $\rho(T)$ is
unbounded, open and need not be connected. The special case $\mathfrak{X} = C^n$ was
considered in Section 1.5. Here $\sigma(T)$ or $\sigma(A)$ in our previous notation was a finite
point set containing at most $n$ distinct points. Things are evidently much more
complicated in the general B-space case. We note, however, that a considerable
part of the analytical properties of the resolvent carries over to the general case.
See Chapter 9. Here we shall merely carry over Lemma 1.5.1.
Lemma 2.4.1. The domain $|\lambda| > \|T\|$ in the complex plane is a subset of $\rho(T)$
and for such values of $\lambda$

$$R(\lambda, T) = \lambda^{-1} I + \lambda^{-2} T + \cdots + \lambda^{-n-1} T^n + \cdots \tag{2.4.24}$$

is a valid representation where the series converges in norm.


Proof. Since the powers of $T$ are defined for all $n$ by recurrence the individual
terms of the series have a meaning. The partial sums of the series form a Cauchy
sequence in $\mathfrak{E}(\mathfrak{X})$ for $|\lambda| > \|T\|$ since the series

$$\sum |\lambda|^{-n-1} \|T\|^n$$

converges for such values. Thus the sum of the series is a well-defined element of
$\mathfrak{E}(\mathfrak{X})$. If the series is multiplied by $\lambda I - T$, either on the left or on the right, and the
powers of $\lambda$ are collected, then all terms cancel except the coefficient of $\lambda^0$, which is $I$.
Hence

$$(\lambda I - T) R(\lambda, T) = R(\lambda, T)(\lambda I - T) = I$$

as asserted. ∎
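The series (2.4.24) can be tried out in the finite-dimensional case, where $T$ is a matrix. The Python sketch below sums the first terms of the series for a sample 2 by 2 matrix and a sample $\lambda$ with $|\lambda|$ larger than a norm of $T$; the function names, the matrix, the value of $\lambda$, and the number of terms are assumptions chosen for illustration.

```python
# Sketch: partial sums of R(lam, T) = sum_{n>=0} lam^(-n-1) T^n for a 2x2 matrix T.

def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def neumann_resolvent(T, lam, terms=60):
    """Sum of the first `terms` terms of the series (2.4.24)."""
    power = [[1.0, 0.0], [0.0, 1.0]]      # T^0 = I
    total = [[0.0, 0.0], [0.0, 0.0]]
    coeff = 1.0 / lam                      # lam^(-1)
    for _ in range(terms):
        total = [[total[i][j] + coeff * power[i][j] for j in range(2)]
                 for i in range(2)]
        power = matmul(power, T)           # next power of T
        coeff /= lam                       # next negative power of lam
    return total

T = [[0.5, 1.0], [0.0, 0.25]]   # largest absolute row sum 1.5
lam = 3.0                        # |lam| exceeds that norm of T
R = neumann_resolvent(T, lam)
lamI_minus_T = [[lam - T[0][0], -T[0][1]], [-T[1][0], lam - T[1][1]]]
check = matmul(lamI_minus_T, R)  # should be close to the identity
```

The product `check` agrees with the identity matrix up to rounding, in line with the last display of the proof.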


EXERCISE 2.4 


1. Show that $T(x)$ is a continuous function of $x$ on $\mathfrak{X}$ if $T \in \mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$.

2. Show that addition and scalar multiplication are continuous operations in $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$.
3. Show that element multiplication is continuous in $\mathfrak{E}(\mathfrak{X})$.

4. Verify (2.4.16)-(2.4.18).


5. If $T \in \mathfrak{E}(\mathfrak{X})$, how are the positive integral powers of $T$ defined? Verify the law of
exponents.


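6. If $T \in \mathfrak{E}(\mathfrak{X})$ and $T^{-1}$ exists and is in $\mathfrak{E}(\mathfrak{X})$, show that
$(T^{-1})^{-1} = T$.

7. If $T_1^{-1}$ and $T_2^{-1} \in \mathfrak{E}(\mathfrak{X})$, show that $T_1 T_2$ has an inverse and
$(T_1 T_2)^{-1} = T_2^{-1} T_1^{-1}$.
8. If $T_1, T_2 \in \mathfrak{E}(\mathfrak{X})$ and $T_1 T_2$ has an inverse in $\mathfrak{E}(\mathfrak{X})$, show that $T_1$ and $T_2$ must have
inverses in $\mathfrak{E}(\mathfrak{X})$.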


9. A bounded linear operator $P \in \mathfrak{E}(\mathfrak{X})$ is called a projection if $P^2 = P$. Suppose
$P \ne I$ and determine $\sigma(P)$. Find $R(\lambda, P)$. [Generalization of (1.5.37).]


10. Same question for an operator Te €(X) such that Τ΄ = 47. [Generalization of 
Problem 5, Exercise 1.5.] 


11. Prove the first resolvent equation for $R(\lambda, T)$. [Generalization of Problem 13,
Exercise 1.5.]
12. An operator $T \in \mathfrak{E}\{C[0, 1]\}$ is defined by $T[f](t) = tf(t)$. Find its resolvent and
spectrum.


13. Same question for $T[f](t) = t(1 + t^2)^{-1} f(t)$.
14. If (2.4.21) holds for $S, T \in \mathfrak{E}(\mathfrak{X})$, why is $S$ unique?


2.5 INNER PRODUCT SPACES 


Let $\mathfrak{X}$ be a linear vector space over the complex numbers. Inner products were
introduced for $C^n$ in Section 1.2. Here follows the generalization to more general
spaces.


Definition 2.5.1. $\mathfrak{X}$ is an inner product space (also called a pre-Hilbert space),
if, for any ordered pair of vectors $x, y \in \mathfrak{X}$, an inner product $(x, y)$ is defined subject
to the following conditions:


1) $(x, y)$ is a complex number.

2) $(x, x) \ge 0$ and $(x, x) = 0$ iff $x = 0$.
3) $(y, x) = \overline{(x, y)}$.

4) $(\alpha x_1 + \beta x_2, y) = \alpha(x_1, y) + \beta(x_2, y)$.


Various facts are direct consequences of these postulates. Thus (3) plus (4)
gives

$$(x, \alpha y_1 + \beta y_2) = \bar{\alpha}(x, y_1) + \bar{\beta}(x, y_2). \tag{2.5.1}$$

This relation, together with (4), expresses the bilinearity of the inner product.
Since

$$(x, 0) = (x, 0 + 0) = (x, 0) + (x, 0),$$

we see that

$$(x, 0) = 0, \quad (0, x) = 0, \quad \forall x. \tag{2.5.2}$$

Lemma 2.5.1. If $(x, y) = 0$ for a fixed $y$ and all $x$, then $y = 0$.


Proof. We have, in particular, $(y, y) = 0$, so by (2) $y = 0$. ∎


We have the following extension of Theorem 1.2.1. 


Theorem 2.5.1. For all $x, y \in \mathfrak{X}$

$$|(x, y)|^2 \le (x, x)(y, y) \tag{2.5.3}$$

with equality iff $y = \gamma x$ for some number $\gamma$.


Proof. We can follow the pattern of Theorem 1.2.1. Let $\alpha$ be a complex number to
be disposed of later. Then

$$0 \le (y - \alpha x, y - \alpha x) = (y, y) - \alpha(x, y) - \bar{\alpha}(y, x) + |\alpha|^2 (x, x).$$

We may assume $x \ne 0$ since otherwise there is nothing to prove. Under this
assumption set

$$\alpha = \frac{(y, x)}{(x, x)}$$

to obtain

$$0 \le (y, y) - \frac{|(y, x)|^2}{(x, x)},$$

which is (2.5.3). It is clear that equality holds iff $y$ is a constant multiple of $x$. ∎


This inequality is a generalization of the classical inequalities of Cauchy and of 
Bounyakovsky-Schwarz. 

Properties (4) and (2.5.3) imply that for fixed $y$ the inner product $(x, y)$ is a
linear bounded functional on $\mathfrak{X}$. We introduce a metric in $\mathfrak{X}$ by


Theorem 2.5.2. The convention

$$\|x\| = (x, x)^{1/2} \tag{2.5.4}$$

defines a norm in $\mathfrak{X}$ and a normed topology by

$$d(x, y) = \|x - y\|. \tag{2.5.5}$$


Proof. It is clear that postulates ($N_1$) and ($N_2$) are satisfied by virtue of properties
(2) and (4). To prove ($N_3$) consider

$$\|x + y\|^2 = (x + y, x + y) = (x, x) + (x, y) + (y, x) + (y, y)$$
$$\le \|x\|^2 + 2|(x, y)| + \|y\|^2$$
$$\le \|x\|^2 + 2\|x\| \, \|y\| + \|y\|^2 = (\|x\| + \|y\|)^2,$$

and this implies ($N_3$). ∎


Corollary. The norm of the linear functional $x^*(x) = (x, y)$, where $y$ is a fixed
element of $\mathfrak{X}$, equals $\|y\|$.


Proof. Definition of the norm plus (2.5.3). ∎


The verification of the following theorems is left to the reader. The first is
known as the Parallelogram Law, the second as the Extended Parallelogram Law,
the third as the Pythagorean Theorem. Cf. Problems 4 and 5, Exercise 1.2. The
"original" Pythagorean theorem was familiar to the Babylonians a good 1000 years
before Pythagoras.


Theorem 2.5.3. For any two vectors $x_1, x_2 \in \mathfrak{X}$

$$\|x_1 + x_2\|^2 + \|x_1 - x_2\|^2 = 2[\|x_1\|^2 + \|x_2\|^2]. \tag{2.5.6}$$

Theorem 2.5.4. For any $n$ vectors $x_1, x_2, \ldots, x_n$ in $\mathfrak{X}$

$$\sum_{1 \le j < k \le n} \|x_j - x_k\|^2 + \Big\| \sum_{j=1}^{n} x_j \Big\|^2 = n \sum_{j=1}^{n} \|x_j\|^2. \tag{2.5.7}$$


Definition 2.5.2. The vectors $x$ and $y$ are orthogonal if

$$(x, y) = 0. \tag{2.5.8}$$

We also say that $x$ and $y$ are perpendicular, written $x \perp y$.


The Pythagorean theorem is a corollary of Theorem 2.5.3 and the definition
of orthogonality.


Theorem 2.5.5. If $x \perp y$, then

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2. \tag{2.5.9}$$


Definition 2.5.3. An inner product space is called a Hilbert space if it is 
complete in the normed metric. 


We use a German capital $\mathfrak{H}$ as a generic notation for a Hilbert space. The
name is in honor of the German mathematician David Hilbert (1861-1943), one of
the last giants in our science.

The various spaces $C^n$ are elementary instances of a Hilbert space. More
sophisticated examples will be found in Chapters 3 and 4. Since $\mathfrak{H}$ is a linear vector
space over the complex numbers, the notions of linear independence of vectors and
dimension of the space are meaningful. If $\mathfrak{H}$ is of finite dimension $n$, then there is
a 1-1 correspondence between $\mathfrak{H}$ and $C^n$ which preserves distance and the algebraic
operations. The first property is known as an isometry, the second as an
isomorphism. We shall assume here that $\mathfrak{H}$ is infinitely dimensional.

This assumption implies that for every $n$ there is a set of $n$ linearly independent
vectors. Without restricting the generality we may assume that in passing from $n$ to
$n + 1$ we simply add a vector $v_{n+1}$ to the set $v_1, v_2, \ldots, v_n$ already obtained. This
means that there is an infinite system of vectors $v_n$ in $\mathfrak{H}$ any $n$ of which are linearly
independent over $C$. From this system we can pass to an orthonormal system
(OS) $\{u_n\}$ by the Gram-Schmidt orthogonalization process. See Sections 1.1 and
1.2. The process applies to $\mathfrak{H}$ just as easily as to $C^n$ and we obtain


$$u_n = c_n \begin{vmatrix} v_1 & v_2 & \cdots & v_n \\ (v_1, v_1) & (v_1, v_2) & \cdots & (v_1, v_n) \\ (v_2, v_1) & (v_2, v_2) & \cdots & (v_2, v_n) \\ \vdots & \vdots & & \vdots \\ (v_{n-1}, v_1) & (v_{n-1}, v_2) & \cdots & (v_{n-1}, v_n) \end{vmatrix} \tag{2.5.10}$$

where $c_n$ is a real normalization factor. This formula is (1.2.9) in an obvious
extension. We have

$$(u_j, u_k) = \delta_{jk} \tag{2.5.11}$$

where $\delta_{jk}$ is the Kronecker delta.
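In computations one usually does not evaluate a determinant but applies the Gram-Schmidt process step by step, subtracting from each new vector its components along the vectors already found. The Python sketch below does this in $C^3$; it is the recursive form of the process rather than the determinant formula, and the function names and sample vectors are assumptions made for the illustration.

```python
import math

# Sketch: step-by-step Gram-Schmidt orthonormalization in C^n.

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return an orthonormal system spanning the same space as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = inner(w, u)                       # component of w along u
            w = [a - c * b for a, b in zip(w, u)]
        norm = math.sqrt(inner(w, w).real)
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append([a / norm for a in w])
    return basis

vs = [[1 + 0j, 1j, 0j], [1 + 0j, 0j, 1 + 0j], [0j, 1 + 0j, 1 + 0j]]
us = gram_schmidt(vs)
# Matrix of inner products (u_j, u_k); it should be the Kronecker delta.
gram = [[inner(uj, uk) for uk in us] for uj in us]
```

Up to rounding, `gram` is the identity matrix, which is the relation $(u_j, u_k) = \delta_{jk}$.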
Suppose now that $x$ is any vector in $\mathfrak{H}$ and form the numbers

$$\hat{x}_k = (x, u_k), \quad k = 1, 2, \ldots. \tag{2.5.12}$$

The circumflex in the left member is known as a "hat" in the vernacular. These
numbers are by definition the Fourier coefficients of the vector $x$ with respect to the
OS $\{u_n\}$. With these coefficients we form the corresponding Fourier series

$$\sum_{k=1}^{\infty} \hat{x}_k u_k. \tag{2.5.13}$$


These objects are named after the colorful French mathematician Joseph Fourier 
(1768-1830), who participated in Napoleon’s expedition to Egypt in 1798-99 where 
he served in the French military government as chief of jurisdiction and secretary 
of the Egyptian Institute in Cairo. He directed the first archaeological survey of 
Egypt. After his return to France he was prefect of the department of Isére for 
14 years. His most famous work in mathematics is his Théorie Analytique de la 
Chaleur (1822), where a profusion of orthogonal series occur. 

Let us return to the series (2.5.13). What is the meaning of the series? We shall
show that the series converges, though normally not in norm, but the partial sums
form a Cauchy sequence in $\mathfrak{H}$. This raises a further question: do the partial sums
converge to the element $x$ with which the series is associated through (2.5.12)?
This does not follow and one of our tasks is to find conditions under which this
holds.

We start by proving 


Theorem 2.5.6. The partial sums of (2.5.13) form a sequence of best
approximation to $x$ in $\mathfrak{H}$ in the sense that for any $n$ and any choice of $n$ numbers
$c_1, c_2, \ldots, c_n$

$$\Big\| x - \sum_{k=1}^{n} c_k u_k \Big\|^2 \ge \|x\|^2 - \sum_{k=1}^{n} |\hat{x}_k|^2 \tag{2.5.14}$$

with equality iff $c_k = \hat{x}_k$ for all $k$.


Proof. To simplify the formulas we use $c^*$ for $\bar{c}$, the conjugate of $c$, in this proof
and the following. We have

$$\Big\| x - \sum_{k=1}^{n} c_k u_k \Big\|^2 = \|x\|^2 - \sum_{k=1}^{n} c_k^* (x, u_k) - \sum_{k=1}^{n} c_k (u_k, x) + \sum_{j=1}^{n} \sum_{k=1}^{n} c_j c_k^* (u_j, u_k).$$

This equals

$$\|x\|^2 - \sum_{k=1}^{n} c_k^* \hat{x}_k - \sum_{k=1}^{n} c_k \hat{x}_k^* + \sum_{j=1}^{n} |c_j|^2.$$

To this we add and subtract

$$\sum_{k=1}^{n} |\hat{x}_k|^2$$

and obtain

$$\|x\|^2 - \sum_{k=1}^{n} |\hat{x}_k|^2 + \sum_{k=1}^{n} |c_k - \hat{x}_k|^2.$$

This gives (2.5.14) with equality iff $c_k = \hat{x}_k$ for all $k$. ∎
Corollary [Bessel's inequality]. We have

$$\sum_{k=1}^{\infty} |\hat{x}_k|^2 \le \|x\|^2. \tag{2.5.15}$$

This is named after the German astronomer Friedrich Wilhelm Bessel (1784-1846),
who noted a special case involving trigonometric Fourier series. He also gave
his name to the Bessel functions.
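The best-approximation property and Bessel's inequality are easy to check numerically in $C^3$ with an orthonormal system of two vectors. The following Python sketch compares the residual obtained with the Fourier coefficients against the residual for a perturbed choice of coefficients; the orthonormal system, the vector $x$, the perturbation, and the helper names are assumptions of the sketch.

```python
import math

# Sketch: best approximation by Fourier coefficients and Bessel's inequality in C^3.

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def residual_sq(x, coeffs, system):
    """Squared norm of x - sum_k c_k u_k."""
    r = list(x)
    for c, u in zip(coeffs, system):
        r = [a - c * b for a, b in zip(r, u)]
    return inner(r, r).real

s = 1 / math.sqrt(2)
system = [[1 + 0j, 0j, 0j], [0j, s + 0j, s * 1j]]    # orthonormal u_1, u_2
x = [1 + 1j, 2 + 0j, -1 + 0j]

fourier = [inner(x, u) for u in system]              # the coefficients (x, u_k)
best = residual_sq(x, fourier, system)
other = residual_sq(x, [fourier[0] + 0.3, fourier[1] - 0.2j], system)
bessel_sum = sum(abs(c) ** 2 for c in fourier)
norm_sq = inner(x, x).real
```

In this example `best` equals `norm_sq - bessel_sum`, and `other` exceeds `best` by the sum of the squared perturbations, as the last display of the proof predicts.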


Theorem 2.5.7. The partial sums of the series (2.5.13) form a Cauchy sequence.


Proof. For $1 \le m < n < \infty$ we have the identity

$$\Big\| \sum_{k=m}^{n} \hat{x}_k u_k \Big\|^2 = \sum_{k=m}^{n} |\hat{x}_k|^2. \tag{2.5.16}$$

Here the right member goes to zero as $m \to \infty$ by virtue of (2.5.15) and the
assertion follows. ∎


We write temporarily $\tilde{x}$ for the limit of the Cauchy sequence. The problem is
now to find the relation between $x$ and $\tilde{x}$. The solution is furnished by the
following theorem.


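Theorem 2.5.8. The following assertions are equivalent:


1) The set of all finite linear combinations of the $u_k$'s is dense in $\mathfrak{H}$.
2) $\sum_{k=1}^{\infty} |\hat{x}_k|^2 = \|x\|^2$ holds for all $x$.

3) $(x, y) = \sum_{k=1}^{\infty} \hat{x}_k (\hat{y}_k)^*$ holds for all $x$ and $y$.

4) $\hat{x}_k = 0$ for all $k$ iff $x = 0$.

5) $\tilde{x} = x$ for all $x$.


Proof. We plan to show that (1) $\Rightarrow$ (2) $\Rightarrow$ (3) $\Rightarrow$ (4) $\Rightarrow$ (5) $\Rightarrow$ (1). Here the
symbol $\Rightarrow$ is read "implies."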


(1) $\Rightarrow$ (2). If (1) holds and $\varepsilon > 0$ is given, then an $n$ and $n$ numbers
$c_1, c_2, \ldots, c_n$ can be found such that

$$\Big\| x - \sum_{k=1}^{n} c_k u_k \Big\|^2 < \varepsilon^2.$$

By (2.5.14) the left member is at least equal to

$$\|x\|^2 - \sum_{k=1}^{n} |\hat{x}_k|^2$$

so that the difference is at most equal to $\varepsilon^2$. Here $\varepsilon$ is arbitrary and $n$ goes to
infinity when $\varepsilon$ goes to zero. In other words, equality holds in (2.5.15) and this is the
assertion of (2), known as Parseval's identity.


(2) $\Rightarrow$ (3). If $\alpha$ is an arbitrary complex number and $x$, $y$ two arbitrary vectors
in $\mathfrak{H}$, then (2) implies that

$$\|x + \alpha y\|^2 = \sum_{k=1}^{\infty} |\hat{x}_k + \alpha \hat{y}_k|^2$$

with obvious notation. Here the left member equals

$$\|x\|^2 + \alpha(y, x) + \alpha^*(x, y) + |\alpha|^2 \|y\|^2$$

while the right member equals

$$\sum_{k=1}^{\infty} |\hat{x}_k|^2 + \alpha \sum_{k=1}^{\infty} \hat{y}_k (\hat{x}_k)^* + \alpha^* \sum_{k=1}^{\infty} \hat{x}_k (\hat{y}_k)^* + |\alpha|^2 \sum_{k=1}^{\infty} |\hat{y}_k|^2.$$

These two expressions are equal for all values of $\alpha$, which means that the coefficients
on both sides of the identity must be equal. Equating the coefficients of $\alpha^*$ we
get (3).

(3) $\Rightarrow$ (4). If for some $x \in \mathfrak{H}$ all Fourier coefficients $\hat{x}_k$ are 0, then by (3) the
inner product $(x, y) = 0$ for all $y$ and by Lemma 2.5.1 this implies $x = 0$, which
is (4).


(4) $\Rightarrow$ (5). Let $1 \le m \le n$ and consider

$$\Big( \sum_{k=1}^{n} \hat{x}_k u_k, u_m \Big) = \sum_{k=1}^{n} \hat{x}_k (u_k, u_m) = \hat{x}_m.$$

Now let $n \to \infty$. The limit of the first member is $(\tilde{x}, u_m)$
(why?), so that $\tilde{x}$ and $x$ have the same Fourier coefficients. This means that all the
Fourier coefficients of $\tilde{x} - x$ are zero. By (4) this requires that $\tilde{x} - x = 0$, which
is (5).

(5) $\Rightarrow$ (1). Let $x \in \mathfrak{H}$ be arbitrary and suppose that $\tilde{x} = x$. This implies

$$\lim_{n \to \infty} \Big\| x - \sum_{k=1}^{n} \hat{x}_k u_k \Big\| = \|x - \tilde{x}\| = 0.$$

The partial sums of the Fourier series of $x$ are finite linear combinations of the $u_k$'s
converging to the given element $x$. Since $x$ is arbitrary, this says that every element
$x$ of $\mathfrak{H}$ can be approximated arbitrarily closely by linear combinations of the $u_k$'s,
i.e. (1) holds. ∎


Another way of formulating these results is to say that the OS can serve as a
basis for $\mathfrak{H}$ iff the system is maximal. Here the statement that OS is a basis for $\mathfrak{H}$
means (5): for every $x \in \mathfrak{H}$

$$x = \sum_{k=1}^{\infty} \hat{x}_k u_k \tag{2.5.17}$$

where the series converges in the sense of the metric. This is occasionally still
referred to as mean square convergence. On the other hand, the statement that OS
is maximal or complete means that OS cannot be a proper subset of another
orthonormal system. For this would violate (4) since it would imply the existence
of a non-zero vector orthogonal to all vectors of OS.

The next notion to be studied is that of linear closed subspaces of $\mathfrak{H}$ and their
orthogonal complements. Given a set $S$, finite or not, of vectors in $\mathfrak{H}$, there is a
closed linear subspace $\mathfrak{M} \subset \mathfrak{H}$ spanned or generated by $S$. By definition $\mathfrak{M}$ contains
all finite linear combinations of vectors in $S$ as well as all limit points of Cauchy
sequences whose elements are such linear combinations. $\mathfrak{M}$ is the smallest closed
linear subspace of $\mathfrak{H}$ with these properties.

Now let $\mathfrak{M}$ be a closed linear subspace of $\mathfrak{H}$ and $\mathfrak{M} \ne \mathfrak{H}$. Let $x$ be a point of $\mathfrak{H}$
not in $\mathfrak{M}$. The distance $d(x, \mathfrak{M})$ of $x$ from $\mathfrak{M}$ is by definition $\inf \|y - x\| = \delta$, where
$y$ ranges over $\mathfrak{M}$. Here $\delta > 0$ for if $\delta = 0$, then $x$ would be a limit point of elements
in $\mathfrak{M}$ and hence in $\mathfrak{M}$ since $\mathfrak{M}$ is closed. Geometric intuition is not very reliable in
a space of infinite dimension, but it does suggest that there is a point $z_0 \in \mathfrak{M}$ such
that $\|x - z_0\| = \delta$. Moreover, the vector $x - z_0$ should be perpendicular to $z_0$.
This will now be proved.


Theorem 2.5.9. If $\mathfrak{M}$ is a closed linear proper subspace of $\mathfrak{H}$ and if $x$ is a point
of $\mathfrak{H}$ not in $\mathfrak{M}$, then there exists a unique point $z_0 \in \mathfrak{M}$ such that $d(x, \mathfrak{M}) =
\|x - z_0\|$. The vector $x - z_0$ is orthogonal to all of $\mathfrak{M}$.


Proof. We have remarked that $d(x, \mathfrak{M}) = \delta > 0$. Then there exists a sequence of
points $z_n \in \mathfrak{M}$ such that

$$\|x - z_n\| \to \delta$$

by the definition of the infimum. It is to be shown that $\{z_n\}$ is a Cauchy sequence.
To this end consider the vectors $x - z_m$ and $x - z_n$. Their sum is $2x - z_m - z_n$, their
difference $z_n - z_m$. By the Parallelogram Law (Theorem 2.5.3)

$$\|z_m - z_n\|^2 + \|2x - z_m - z_n\|^2 = 2[\|x - z_m\|^2 + \|x - z_n\|^2],$$

and this may be written

$$\|z_m - z_n\|^2 = 2\|x - z_m\|^2 + 2\|x - z_n\|^2 - 4\|x - \tfrac{1}{2}(z_m + z_n)\|^2.$$

Since $\mathfrak{M}$ is linear and convex, $\tfrac{1}{2}(z_m + z_n) \in \mathfrak{M}$. In the last display the first term on
the right goes to the limit $2\delta^2$ and so does the second term, while the third may not
have a limit as $m$ and $n \to \infty$ but in any case stays $\le -4\delta^2$. It follows that the
superior limit of the right member is $\le 0$ and this requires that it has the limit zero.
Hence $\{z_n\}$ is a Cauchy sequence and we denote its limit by $z_0$. It belongs to $\mathfrak{M}$
since $\mathfrak{M}$ is closed. Further,

$$\|x - z_0\| = \lim_{n \to \infty} \|x - z_n\| = \delta \tag{2.5.18}$$

by the continuity of the norm. Thus $z_0$ is a point in $\mathfrak{M}$ where $d(x, y)$ reaches its
minimum for $y \in \mathfrak{M}$.

There is one and only one such point. For if there were two, $z_0$ and $w_0$, both
at the distance $\delta$ from $x$, then we consider the vectors $x - z_0$ and $x - w_0$ and use
the Parallelogram Law again to obtain

$$4\delta^2 = 2\|x - z_0\|^2 + 2\|x - w_0\|^2 = \|z_0 - w_0\|^2 + 4\|x - \tfrac{1}{2}(z_0 + w_0)\|^2 \ge \|z_0 - w_0\|^2 + 4\delta^2$$

or $\|z_0 - w_0\|^2 \le 0$. Here equality must hold so that $w_0 = z_0$ and the point of $\mathfrak{M}$
nearest to $x$ is unique.

To prove the orthogonality of $x - z_0$ to $\mathfrak{M}$, we consider any point $y$ of $\mathfrak{M}$ and
any complex number $\alpha$. Form

$$\|z_0 + \alpha y - x\|^2 = \|z_0 - x\|^2 + \alpha^*(z_0 - x, y) + \alpha(y, z_0 - x) + |\alpha|^2 \|y\|^2.$$

The term on the left is $\ge \delta^2$ since $z_0 + \alpha y \in \mathfrak{M}$ and the first term on the right
equals $\delta^2$. It follows that

$$2\Re[\alpha^*(z_0 - x, y)] + |\alpha|^2 \|y\|^2 \ge 0, \tag{2.5.19}$$

where $\Re$ denotes "the real part of." Here there are two possibilities: $(z_0 - x, y) = 0$
or is $\ne 0$. In the first case we are through. In the second case we note that the real
part may be positive or negative and the sign is at our disposal since $\alpha$ is arbitrary.
Moreover, making $|\alpha|$ small we can make the second term in (2.5.19) small in
comparison with the first. This implies that the left member of (2.5.19) is capable
of taking on negative values. Thus the second alternative leads to a contradiction
and the first one must hold. ∎


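Consider again the closed linear subspace $\mathfrak{M} \ne \mathfrak{H}$. If $x$ is a point of $\mathfrak{H} \ominus \mathfrak{M}$,
then, as we have seen, there is a unique point $z_0$ of $\mathfrak{M}$ nearest to $x$ and $z_0 - x$ is
orthogonal to all of $\mathfrak{M}$. Since $d(x, \mathfrak{M}) = \delta > 0$, there are points $y \in \mathfrak{H} \ominus \mathfrak{M}$ in a
$\delta$-neighborhood of $x$. Each such point $y$ has a nearest point $w_0 \in \mathfrak{M}$ and $w_0 - y$
is orthogonal to all of $\mathfrak{M}$. Normally $z_0 - x$ and $w_0 - y$ are linearly independent.

Denote the set of vectors orthogonal to $\mathfrak{M}$ by $\mathfrak{M}^\perp$, colloquially if not elegantly
known as "$\mathfrak{M}$-perp." This is a linear closed subspace of $\mathfrak{H}$. For if $v_1$ and $v_2$ are
orthogonal to $\mathfrak{M}$, so is $\alpha v_1 + \beta v_2$ by the linearity of the inner product. Further, if
$\{v_n\}$ is a Cauchy sequence in $\mathfrak{M}^\perp$ which converges to $v_0$, then

$$0 = (z, v_n) \quad \text{implies} \quad (z, v_0) = 0$$

by the continuity of the inner product which is implied by the continuity of the
norm. Thus $v_0 \in \mathfrak{M}^\perp$.
It is clear that $\mathfrak{M} \cap \mathfrak{M}^\perp = \{0\}$. Further,

$$\mathfrak{M}_1 \subset \mathfrak{M}_2 \Rightarrow \mathfrak{M}_2^\perp \subset \mathfrak{M}_1^\perp. \tag{2.5.20}$$

Somewhat less obvious is

$$(\mathfrak{M}^\perp)^\perp = \mathfrak{M}. \tag{2.5.21}$$

The proof is left to the reader.
If $\mathfrak{A}$ and $\mathfrak{B}$ are two linear subspaces of a linear vector space $\mathfrak{X}$ and $\mathfrak{A} \cap \mathfrak{B} =
\{0\}$, then the set

$$\mathfrak{W} = \{x + y;\ x \in \mathfrak{A},\ y \in \mathfrak{B}\} \tag{2.5.22}$$

is known as the vector sum of $\mathfrak{A}$ and $\mathfrak{B}$.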


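Theorem 2.5.10. If $\mathfrak{M}$ is a closed linear subspace of $\mathfrak{H}$ and $\mathfrak{M}^\perp$ is the set of all
vectors of $\mathfrak{H}$ orthogonal to $\mathfrak{M}$, then the vector sum of $\mathfrak{M}$ and $\mathfrak{M}^\perp$ is all of $\mathfrak{H}$.
If $w$ is any vector of $\mathfrak{H}$ there exists a unique representation of $w$ of the form

$$w = u + v, \quad u \in \mathfrak{M}, \quad v \in \mathfrak{M}^\perp. \tag{2.5.23}$$


Proof. Denote temporarily the vector sum of $\mathfrak{M}$ and $\mathfrak{M}^\perp$ by $\mathfrak{W}$. It is clear that $\mathfrak{W}$
is a linear subspace of $\mathfrak{H}$. It is also closed. For suppose that

$$w_n = u_n + v_n, \quad u_n \in \mathfrak{M}, \quad v_n \in \mathfrak{M}^\perp$$

and $\{w_n\}$ is a Cauchy sequence in $\mathfrak{W}$. Then

$$(u_m + v_m - u_n - v_n, u_m + v_m - u_n - v_n)$$
$$= \|u_m - u_n\|^2 + (u_m - u_n, v_m - v_n) + (v_m - v_n, u_m - u_n) + \|v_m - v_n\|^2.$$

Here $u_m - u_n \in \mathfrak{M}$, $v_m - v_n \in \mathfrak{M}^\perp$ and they are orthogonal. By the Pythagorean
theorem

$$\|(u_m + v_m) - (u_n + v_n)\|^2 = \|u_m - u_n\|^2 + \|v_m - v_n\|^2.$$

If the left member converges to zero as $m$ and $n \to \infty$, each of the terms on the right
converges to zero. Hence $\{u_n\}$ and $\{v_n\}$ are Cauchy sequences. If they converge to
$u_0$ and $v_0$, respectively, then $u_n + v_n \to u_0 + v_0 \in \mathfrak{W}$ by the continuity of vector
addition. It follows that the vector sum $\mathfrak{W}$ is a closed linear subspace of $\mathfrak{H}$.

Hence $\mathfrak{W}^\perp$ is well defined. Now if $z \in \mathfrak{W}^\perp$, then $z$ is orthogonal to all of $\mathfrak{M}$ as
well as to all of $\mathfrak{M}^\perp$. For $z$ is orthogonal to every vector of the form

$$x + y, \quad x \in \mathfrak{M}, \quad y \in \mathfrak{M}^\perp.$$

In particular, for $y = 0$ we see that $(x, z) = 0$, $\forall x \in \mathfrak{M}$. Similarly for $x = 0$,
$(y, z) = 0$, $\forall y \in \mathfrak{M}^\perp$. But if $z$ is orthogonal to all of $\mathfrak{M}$, then $z \in \mathfrak{M}^\perp$ and if $z$ is
orthogonal to all of $\mathfrak{M}^\perp$, then $z \in \mathfrak{M}$ by (2.5.21). Since $\mathfrak{M} \cap \mathfrak{M}^\perp = \{0\}$ we have
$z = 0$ and $\mathfrak{W}^\perp = \{0\}$. Hence $\mathfrak{W} = \mathfrak{H}$.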
This asserts that every element $w$ of $\mathfrak{H}$ admits at least one representation of the
form (2.5.23). But if one such representation exists, it is clearly unique. For if

$$w = u_1 + v_1 = u_2 + v_2,$$

then $u_1 - u_2 = v_2 - v_1$. Here the left member belongs to $\mathfrak{M}$, the right to $\mathfrak{M}^\perp$, and
these subspaces of $\mathfrak{H}$ have only the zero element in common. ∎
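When $\mathfrak{M}$ is finite-dimensional with a known orthonormal basis, the two components of $w$ can be written down explicitly: the component in $\mathfrak{M}$ is the sum of the Fourier terms of $w$ with respect to that basis, and the other component is the remainder. The Python sketch below carries this out in $C^3$; the basis `onb`, the vector `w`, and the function names are illustrative assumptions.

```python
import math

# Sketch: orthogonal decomposition w = u + v in C^3, with M spanned by an orthonormal basis.

def inner(x, y):
    """Inner product on C^n, linear in x and conjugate-linear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def decompose(w, onb):
    """Split w = u + v with u in span(onb) and v orthogonal to span(onb)."""
    u = [0j] * len(w)
    for e in onb:
        c = inner(w, e)                       # Fourier coefficient (w, e)
        u = [a + c * b for a, b in zip(u, e)]
    v = [a - b for a, b in zip(w, u)]
    return u, v

s = 1 / math.sqrt(2)
onb = [[s + 0j, s + 0j, 0j], [0j, 0j, 1j]]    # orthonormal basis of M
w = [3 + 0j, 1 + 0j, 2 - 1j]
u, v = decompose(w, onb)

orthogonality = [abs(inner(v, e)) for e in onb]   # should vanish
dist_sq = inner(v, v).real                        # squared distance from w to M
```

For this sample the component `v` is orthogonal to both basis vectors, and the squared norms of `u` and `v` add up to the squared norm of `w`, as the Pythagorean theorem requires.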


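We have proved that

$$\mathfrak{M}^\perp = \mathfrak{H} \ominus \mathfrak{M} \tag{2.5.24}$$

and we can express Theorem 2.5.10 by the formula

$$\mathfrak{H} = \mathfrak{M} \oplus \mathfrak{M}^\perp. \tag{2.5.25}$$

This is a decomposition of $\mathfrak{H}$ into the sum of two orthogonal subspaces having only
the zero element in common. In view of this, $\mathfrak{M}^\perp$ is called the orthogonal complement
of $\mathfrak{M}$, and vice versa.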

Next we give an application of linear closed subspaces to the theory of linear
bounded functionals of $\mathfrak{H}$. The set of all such functionals forms the adjoint space
$\mathfrak{H}^*$. Actually there is a 1-1 mapping of $\mathfrak{H}$ into $\mathfrak{H}^*$ which is an isometry and a skew
isomorphism in the sense that sums go into sums but scalar products into their
conjugates so that $y \to x^*$ implies

$$\alpha y \to \bar{\alpha} x^*. \tag{2.5.26}$$

Here $x^*$ denotes a generic element of $\mathfrak{H}^*$ and the bar indicates the conjugate
complex number. We express this correspondence by saying that $\mathfrak{H}$ is skew
self-adjoint. We shall prove


Theorem 2.5.11. If $x^* \in \mathfrak{H}^*$, then there exists a unique element $y \in \mathfrak{H}$ such that
$x^*(x) = (x, y)$ for all $x$ in $\mathfrak{H}$.
Proof. Consider the nullspace of x* 
MN = NEx*] = {x; x*(x) = 0). 


Here 9 is a linear closed subspace of § (why?), so there is an orthogonal com- 
plement 94. If now 9 should reduce to the zero element, then 9t = § and 
x*(x) = (x, 0) is the trivial representation of the functional as an inner product. 
Suppose that M+ contains an element z + 0. We shall show that there is a constant 
y such that | 


$x^*(x) = (x, \gamma z).$ (2.5.27)
To this end consider the decomposition
$x = (x - \beta z) + \beta z,$


where $\beta$ is a number that will depend upon $x$. We choose $\beta$ so that $x - \beta z \in \mathfrak{N}$.
This calls for


$\beta = \dfrac{x^*(x)}{x^*(z)}.$
Note that $x^*(z) \ne 0$ since $z \in \mathfrak{N}^{\perp}$ and $z \ne 0$. We have then


$(x - \beta z, \gamma z) = 0$
for any choice of $\gamma$. Now $\gamma$ is disposed of so that
$x^*(\beta z) = (\beta z, \gamma z) = \beta \bar{\gamma} \|z\|^2.$ (2.5.28)
This gives the condition
$\bar{\gamma} \|z\|^2 = x^*(z),$ (2.5.29)


which determines $\gamma$ uniquely. We have then


$x^*(x) = x^*(x - \beta z) + \beta x^*(z) = (x - \beta z, \gamma z) + \beta (z, \gamma z) = (x, \gamma z)$
with $\gamma$ determined by (2.5.29). The uniqueness follows from


$(x, \gamma z) = (x, y)$


for all $x$ implies $y = \gamma z$. The mapping $y \to x^*$ clearly has the properties of
isometry and skew isomorphism. ∎
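
The construction in this proof can be tried out numerically in the finite-dimensional case. The sketch below works in $C^3$ and assumes the inner product $(x, y) = \sum_k x_k \bar{y}_k$; the coefficient vector `c`, the functional `x_star`, and the helper names are illustrative choices, not part of the text.

```python
# Sketch of the construction in the proof, carried out in C^3.
# Assumptions (not from the text): the inner product is
# (x, y) = sum_k x_k * conj(y_k), and all names below are illustrative.

def inner(x, y):
    """Inner product on C^n, linear in the first argument."""
    return sum(xk * yk.conjugate() for xk, yk in zip(x, y))

def riesz_vector(functional, z):
    """Return y = gamma*z with functional(x) = (x, y), where z != 0 is
    orthogonal to the nullspace of the functional."""
    # Condition (2.5.29): conj(gamma) * ||z||^2 = x*(z)
    gamma = (functional(z) / inner(z, z)).conjugate()
    return [gamma * zk for zk in z]

c = [2 + 1j, -1j, 3 + 0j]  # hypothetical coefficients of the functional

def x_star(x):
    """The functional x*(x) = sum_k c_k x_k."""
    return sum(ck * xk for ck, xk in zip(c, x))

# x*(x) = (x, conj(c)), so every non-zero multiple of conj(c) is
# orthogonal to the nullspace of x*.
z = [(2 - 3j) * ck.conjugate() for ck in c]
y = riesz_vector(x_star, z)

x = [1 - 2j, 0.5 + 0j, -4 + 1j]
residual = abs(x_star(x) - inner(x, y))
print(residual)  # close to zero, up to rounding
```

Whatever non-zero multiple of $\bar{c}$ is used for `z`, the factor $\gamma$ compensates and the same representing vector `y` results, in line with the uniqueness statement.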


As the last topic of this section we shall give a brief discussion of the analogues 
of quadratic forms, polar forms, and Hermitian forms for Hilbert space. It is just 
an introduction, a preview of Chapter 11. 

Suppose that $T \in \mathfrak{E}(\mathfrak{H})$, the space of linear bounded transformations from $\mathfrak{H}$
into itself, and consider the linear bounded functional


$(Tx, y)$ (2.5.30)


obtained by fixing $y$ and letting $x$ vary. This is the polar form analogous to (1.7.6)
for the case $\mathfrak{H} = C^n$. For $y = x$ we get the quadratic form analogous to (1.7.1).
We have also an adjoint transformation $T^*$ such that for all $x$ and $y$ in $\mathfrak{H}$


$(Tx, y) = (x, T^*y).$ (2.5.31)
Here $T^*$ is a uniquely defined element of $\mathfrak{E}(\mathfrak{H})$. If
$T^* = T$ (2.5.32)


the operator $T$ is said to be self-adjoint or Hermitian, and the corresponding forms
$(Tx, y)$ and $(Tx, x)$ are Hermitian.


Theorem 2.5.12. If $T$ is Hermitian, then $(Tx, x)$ is real-valued for all $x$ and for
$\|x\| = 1$ its value belongs to the interval $[-\|T\|, \|T\|]$, where $\|T\|$ is the norm
of $T$ in $\mathfrak{E}(\mathfrak{H})$.


Proof. We have
$(Tx, x) = \overline{(x, Tx)}$


by property (3) of the inner product. Since
$(Tx, x) = (x, Tx)$
for Hermitian forms, $(Tx, x)$ must be real. We have
$|(Tx, x)| \le \|Tx\| \, \|x\| \le \|T\|$
for $\|x\| = 1$. ∎
Actually at least one of the values $\|T\|$ and $-\|T\|$ is assumed by $(Tx, x)$ on
the unit sphere. Compare Theorem 1.7.8 and subsequent discussion.
Just as in $C^n$, we have


Theorem 2.5.13. If $S$ and $T$ are Hermitian operators in $\mathfrak{E}(\mathfrak{H})$, then so are $S + T$
and $\alpha S$, where $\alpha$ is real, while $ST$ is Hermitian iff $S$ and $T$ commute.


Proof. The first two assertions are immediate consequences of the properties of
Hermitian forms. Since $(ST)^* = T^*S^* = TS$ it follows that $ST$ is Hermitian iff
$ST = TS$. ∎


We also note that for any complex number $\alpha$
$(\alpha S)^* = \bar{\alpha} S^*.$ (2.5.33)


Furthermore, an arbitrary operator $A \in \mathfrak{E}(\mathfrak{H})$ admits of a unique representation in
terms of Hermitian operators
$A = B + iC,$ (2.5.34)


where


$B = \tfrac{1}{2}(A + A^*), \quad C = \tfrac{1}{2i}(A - A^*).$ (2.5.35)


The operators B and C are known as the real and the imaginary parts of A, 
respectively. It should be noted that B and C are self-adjoint. This is obvious in 
the case of B; to prove the statement for C we have to use (2.5.33). The reader will 
observe that the involution defined by $T \to T^*$ is a generalization of the elementary
operation of conjugation, $\alpha \to \bar{\alpha}$, in the theory of complex numbers, with self-adjoint
operators playing the part of real numbers.
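
For matrices the decomposition (2.5.34)-(2.5.35) is easy to check by direct computation. The following sketch uses a $2 \times 2$ complex matrix stored as nested lists; the matrix `A` and the helper names are illustrative assumptions.

```python
# Sketch: real and imaginary parts B = (A + A*)/2, C = (A - A*)/(2i)
# of a square complex matrix. All names are illustrative.

def adjoint(a):
    """Conjugate transpose of a square matrix given as nested lists."""
    n = len(a)
    return [[a[k][j].conjugate() for k in range(n)] for j in range(n)]

def combine(alpha, a, beta, b):
    """Entrywise alpha*a + beta*b."""
    return [[alpha * x + beta * y for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]

def real_imag_parts(a):
    """Return (B, C) as in (2.5.35)."""
    a_star = adjoint(a)
    b = combine(0.5, a, 0.5, a_star)
    c = combine(1 / (2j), a, -1 / (2j), a_star)
    return b, c

def max_diff(a, b):
    """Largest modulus of an entry of a - b."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[1 + 2j, 3 - 1j], [4j, 2 + 0j]]
B, C = real_imag_parts(A)

print(max_diff(B, adjoint(B)))             # B is self-adjoint
print(max_diff(C, adjoint(C)))             # C is self-adjoint
print(max_diff(combine(1, B, 1j, C), A))   # A = B + iC
```

That `C` comes out self-adjoint is exactly the point where (2.5.33) enters: taking the adjoint changes the sign of $A - A^*$ and conjugates the factor $1/(2i)$.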


Definition 2.5.4. $A \in \mathfrak{E}(\mathfrak{H})$ is said to be normal if $A$ and $A^*$ commute.


A necessary and sufficient condition for $A$ to be normal is that its real and
imaginary parts commute, for
$A^* = (B + iC)^* = B^* - iC^* = B - iC.$ (2.5.36)


Unitary operators are particularly important instances of normal operators. 
We can define U to be unitary if 


$UU^* = U^*U = I,$
or, equivalently, by requiring that $U$ have an inverse and
$U^{-1} = U^*.$ (2.5.37)
Since
$(x, y) = (U^*Ux, y) = (Ux, U^{**}y) = (Ux, Uy),$ (2.5.38)


unitary operators preserve inner products and, in particular, distances. Thus $U$
defines a mapping of $\mathfrak{H}$ onto itself which is an isometry as well as an isomorphism.
Such a mapping is called an automorphism. Conversely, every automorphism of a
Hilbert space is defined by a unitary transformation (why?).

The last topic featured in this preview of Hermitian forms is projections. 


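Definition 2.5.5. A mapping $P$ from $\mathfrak{H}$ into itself is called a projection of $\mathfrak{H}$
onto the linear closed subspace $\mathfrak{M}$ of $\mathfrak{H}$ if $x = u + v$, $u \in \mathfrak{M}$, $v \in \mathfrak{M}^{\perp}$, implies
$Px = u$.


This convention makes the range of $P$
$\mathfrak{R}[P] = \mathfrak{M} = \{x; Px = x\}.$ (2.5.39)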


Theorem 2.5.14. The projection $P$ onto $\mathfrak{M}$ is an idempotent Hermitian operator
and, if $\mathfrak{M} \ne \{0\}$, then $\|P\| = 1$.


Proof. It is clear that $P$ is a linear bounded operator (why?) and thus belongs to
$\mathfrak{E}(\mathfrak{H})$. Since
$P^2 x = P(Px) = Pu = u = Px$


we have $P^2 = P$ and $P$ is idempotent. Next, since $(u, v) = (v, u) = 0$ we have
$(Px, x) = (u, u + v) = \|u\|^2 + (u, v) = \|u\|^2 = (u + v, u) = (x, Px)$
or $P^* = P$ and $P$ is Hermitian. By the Pythagorean theorem
$\|x\|^2 = \|u + v\|^2 = \|u\|^2 + \|v\|^2$


or
$\|u\|^2 \le \|x\|^2$ implies $\|Px\| \le \|x\|$,


so that $\|P\| \le 1$. If $\mathfrak{M} \ne \{0\}$, then there is an $x \ne 0$ in $\mathfrak{M}$ and for such an $x$ we
have $Px = x$ so that $\|P\| = 1$.

It should be noted that $I - P$ is also a projection which maps $\mathfrak{H}$ onto $\mathfrak{M}^{\perp}$.
The nullspace of $P$ is $\mathfrak{M}^{\perp}$, that of $I - P$ is $\mathfrak{M}$. Conversely, if $T$ is Hermitian and
an idempotent and if $\mathfrak{N}$ is the nullspace of $T$, then $T$ is a projection of $\mathfrak{H}$ onto $\mathfrak{N}^{\perp}$. ∎
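
The three properties in Theorem 2.5.14 can be observed in the simplest non-trivial case. The sketch below takes $\mathfrak{M}$ to be the one-dimensional subspace of $C^3$ spanned by a vector `m`, so that $Px = (x, m)\,m/(m, m)$; this formula and all names are illustrative assumptions, not taken from the text.

```python
# Sketch: projection of C^3 onto the line spanned by m.
# Assumption: (x, y) = sum_k x_k * conj(y_k); names are illustrative.
from math import sqrt

def inner(x, y):
    return sum(xk * yk.conjugate() for xk, yk in zip(x, y))

def norm(x):
    return sqrt(inner(x, x).real)

def project(x, m):
    """Component u of x in the subspace spanned by m != 0."""
    coeff = inner(x, m) / inner(m, m)
    return [coeff * mk for mk in m]

m = [1 + 1j, 2 + 0j, -1j]
x = [3 - 1j, 1j, 2 + 2j]
y = [0.5 + 0j, -1 + 4j, 1 - 1j]

Px, Py = project(x, m), project(y, m)

# P(Px) = Px (idempotent)
idempotent_gap = max(abs(a - b) for a, b in zip(project(Px, m), Px))
# (Px, y) = (x, Py) (Hermitian)
hermitian_gap = abs(inner(Px, y) - inner(x, Py))

print(idempotent_gap, hermitian_gap)
print(norm(Px) <= norm(x), abs(norm(project(m, m)) - norm(m)))
```

The last line reflects the norm statement: $\|Px\| \le \|x\|$ in general, with equality for the vector `m` itself, which lies in $\mathfrak{M}$.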


EXERCISE 2.5 


1. Prove Theorem 2.5.3. 
2. Prove Theorem 2.5.4. 
3. Prove Theorem 2.5.5. 


4. When is the series (2.5.13) convergent in norm? 
5. State and prove the analogue of Theorem 1.7.1 for a Hilbert space.

6. Same question for Theorem 1.7.2.

7. Prove the existence of a transformation $T^*$ satisfying (2.5.31). Why is $T^*$ unique?

8. Show that the decomposition (2.5.34) is unique.

9. Verify that the mapping $y \to x^*$ of Theorem 2.5.11 from $\mathfrak{H}$ into $\mathfrak{H}^*$ is skew self-adjoint.

10. If a Hermitian operator $T$ has a characteristic value $\lambda_0$ and $T(x_0) = \lambda_0 x_0$, $\|x_0\| = 1$,
show that $\lambda_0$ is real and find bounds for $\lambda_0$.

11. Prove that if $T$ is a Hermitian idempotent and its nullspace $\mathfrak{N}$ is neither $\{0\}$ nor $\mathfrak{H}$,
then $T$ is a projection of $\mathfrak{H}$ onto $\mathfrak{N}^{\perp}$.

12. Why is an automorphism of $\mathfrak{H}$ defined by a unitary operator?

13. Let $\mathfrak{H} = \mathfrak{M}_1 \oplus \mathfrak{M}_2 \oplus \cdots \oplus \mathfrak{M}_n$ be a decomposition of $\mathfrak{H}$ into closed linear
subspaces with $\mathfrak{M}_j \cap \mathfrak{M}_k = \{0\}$, $j \ne k$. Let $P_k$ be the projection of $\mathfrak{H}$ onto $\mathfrak{M}_k$.
Show that $P_j P_k = O$ and $P_1 + P_2 + \cdots + P_n = I$.

14. If $\{0\} \subset \mathfrak{M}_1 \subset \mathfrak{M}_2 \subset \mathfrak{H}$ and $P_1$, $P_2$ are the corresponding projections, find $P_1 P_2$
and $P_2 P_1$.

15. Show that a projection $P$ has the characteristic values 0 and 1 ($P \ne O, I$) and no
others and find the characteristic vectors.


2.6 BANACH ALGEBRAS


This is essentially a preview of things to come. Some basic properties are stated and 
proved but the more sophisticated parts of the theory are to be found in Chapter 9. 


Definition 2.6.1. A Banach algebra $\mathfrak{B}$ (B-algebra for short) is a Banach space
in which a notion of element multiplication is defined subject to the following
conditions:
(M$_1$) For each ordered pair of elements $x$ and $y$ in $\mathfrak{B}$ there is a product $xy \in \mathfrak{B}$.
(M$_2$) Multiplication is associative: $x(yz) = (xy)z$.
(M$_3$) Multiplication is distributive with respect to addition:

$x(y + z) = xy + xz, \quad (x + y)z = xz + yz.$
(M$_4$) Multiplication commutes with scalar multiplication:

$(\alpha x)(\beta y) = (\alpha\beta)(xy).$
(M$_5$) $\|xy\| \le \|x\| \, \|y\|.$


Multiplication is not necessarily commutative. If


$xy = yx, \quad \forall x, y,$ (2.6.1)


we speak of a commutative B-algebra.
An algebra may have a unit element $e$ such that


$ex = xe = x, \quad \forall x.$ (2.6.2)
Since $e^2 = e$, condition (M$_5$) implies that $\|e\| \ge 1$. We shall assume
$\|e\| = 1.$ (2.6.3)


Actually this assumption does not restrict the generality for it may be shown that
if the original norm does not have this property, then a new norm may be found in
terms of which (2.6.3) holds and, what is essential, the two norms are equivalent in
the sense that they induce the same topologies in the space.
If the algebra has a unit element $e$, then some elements $x \in \mathfrak{B}$ may have
inverses in $\mathfrak{B}$. Here $y$ is said to be the inverse of $x$ if
$xy = yx = e,$ (2.6.4)
and we write
$y = x^{-1}.$ (2.6.5)
It should be noted that an element $x \in \mathfrak{B}$ has at most one inverse so the inverse is
unique if it exists. If $x$ has an inverse, it is said to be invertible or to be regular.
Elements without inverses are called singular. If $\mathfrak{B} = C$, the field of complex numbers,
then all elements $\ne 0$ have inverses. On the other hand, if $\mathfrak{B} = \mathfrak{M}_n$, $n > 1$,
then some elements (matrices in this case) have inverses while others have not.


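Theorem 2.6.1. The set of all regular elements of a B-algebra forms a group $\mathfrak{G}$
under multiplication.
Proof. If $x$ and $y \in \mathfrak{G}$, then
$xyy^{-1}x^{-1} = e, \quad y^{-1}x^{-1}xy = e$ (2.6.6)


so that $xy \in \mathfrak{G}$ and has the inverse $y^{-1}x^{-1}$ in $\mathfrak{G}$. Also $e \in \mathfrak{G}$ and if $x \in \mathfrak{G}$, so does $x^{-1}$
since $(x^{-1})^{-1} = x$. ∎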
Theorem 2.6.2. If $\mathfrak{B}$ is a B-algebra, then all elements $x$ in the sphere
$\|x - e\| < 1$ (2.6.7)
belong to $\mathfrak{G}$.
Proof. The geometric series
$e + (e - x) + (e - x)^2 + \cdots + (e - x)^n + \cdots$ (2.6.8)


converges in norm to an element $y$ of $\mathfrak{B}$ if (2.6.7) holds (why?). If the series is
multiplied either on the left or on the right by


$x = e - (e - x),$
then multiplication may be carried out termwise, and if the terms are collected,
then all powers cancel except $e$ so that $y$ is the inverse of $x$ and $x \in \mathfrak{G}$. ∎
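
In the matrix algebra $\mathfrak{M}_2$ the geometric series (2.6.8) can be summed numerically. The sketch below truncates the series after a fixed number of terms; the sample matrix `x`, the truncation length, and the helper names are illustrative assumptions.

```python
# Sketch: partial sums of e + (e - x) + (e - x)^2 + ... for a 2x2 matrix x
# with ||x - e|| < 1 (row-sum norm). All names are illustrative.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_inverse(x, terms=80):
    """Sum of the first `terms` terms of the series (2.6.8)."""
    n = len(x)
    e = identity(n)
    d = [[e[i][j] - x[i][j] for j in range(n)] for i in range(n)]  # e - x
    power = e                                  # current power (e - x)^k
    total = [[0.0] * n for _ in range(n)]
    for _ in range(terms):
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
        power = matmul(power, d)
    return total

x = [[1.2, 0.3], [-0.1, 0.9]]   # here ||x - e|| = 0.5 in the row-sum norm
y = neumann_inverse(x)

e = identity(2)
left_gap = max(abs(p - q) for rp, rq in zip(matmul(y, x), e)
               for p, q in zip(rp, rq))
right_gap = max(abs(p - q) for rp, rq in zip(matmul(x, y), e)
                for p, q in zip(rp, rq))
print(left_gap, right_gap)      # both close to zero
```

The remainder after $N$ terms is bounded in norm by $\|e - x\|^N / (1 - \|e - x\|)$, which is why a modest number of terms suffices here.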


Theorem 2.6.3. In the normed metric $\mathfrak{G}$ is an open point set.


Proof. It is to be shown that if $x_0 \in \mathfrak{G}$, then there is a neighborhood of $x_0$ also in
$\mathfrak{G}$. This follows essentially by the argument used in proving the preceding theorem.
Note that
$x = x_0 - (x_0 - x) = x_0 [e - x_0^{-1}(x_0 - x)].$


The first factor in the last member is known to have an inverse while the second is
invertible for
$\|x_0^{-1}(x_0 - x)\| < 1.$
This holds if, for instance,
$\|x - x_0\| < \|x_0^{-1}\|^{-1}.$
This set is a neighborhood of $x_0$. ∎


The following result is due to I. M. Gelfand. The proof will be given in 
Section 9.1. 


Theorem 2.6.4. If $x \in \mathfrak{B}$, then
$\lim_{n \to \infty} \|x^n\|^{1/n} = r(x)$ (2.6.9)


exists and $0 \le r(x) \le \|x\|$.


This number $r(x)$ is known as the spectral radius of $x$ for reasons that soon will
become evident. Here we cannot exclude the possibility that $r(x) = 0$ for $x \ne 0$.
In analogy with the matrix case we say that an element $x$ of $\mathfrak{B}$ is nilpotent of degree $k$
if there exists an integer $k$ such that $x^k = 0$, $x^{k-1} \ne 0$. Such an element is singular
and $r(x) = 0$. But there are other possibilities. In the operator algebra $\mathfrak{E}(C[0, 1])$
(= linear bounded transformations from the space of functions continuous on
$[0, 1]$ into itself) we may take the very simple operator


$T[f](t) = \int_0^t f(s) \, ds.$ (2.6.10)


It may be shown that $r(T) = 0$. An element $q$ of $\mathfrak{B}$ such that $r(q) = 0$ is said to be
quasi-nilpotent or a topologically nilpotent element. Any such element is singular.
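
The behaviour of the powers of the integration operator (2.6.10) can be seen on a single test function. The sketch below applies $T$ repeatedly to $f = 1$ using exact polynomial arithmetic; it only illustrates the decay of $\|T^n f\|$ for this one $f$ and is not a proof that $r(T) = 0$. The representation of polynomials as coefficient lists and all names are illustrative assumptions.

```python
# Sketch: iterate T[f](t) = integral_0^t f(s) ds on the constant function 1.
# Polynomials are coefficient lists [c_0, c_1, ...] for sum c_k t^k.
from fractions import Fraction
from math import factorial

def integrate(coeffs):
    """Coefficients of t -> integral_0^t p(s) ds."""
    return [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]

def iterate_T(coeffs, n):
    """Apply the integration operator n times."""
    for _ in range(n):
        coeffs = integrate(coeffs)
    return coeffs

def sup_norm_nonneg(coeffs):
    """Sup over [0, 1] of a polynomial with non-negative coefficients,
    which is attained at t = 1."""
    return sum(coeffs)

norms = [sup_norm_nonneg(iterate_T([Fraction(1)], n)) for n in range(1, 9)]
roots = [float(v) ** (1.0 / n) for n, v in enumerate(norms, start=1)]

print(norms[:4])   # 1, 1/2, 1/6, 1/24 as Fractions
print(roots)       # decreasing n-th roots
```

Here $T^n 1 = t^n / n!$, so the sup-norm is $1/n!$ and its $n$th root tends to zero; compare Problem 10 of Exercise 2.6.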
Various types of singular matrices occurred in Section 1.4 such as divisors
of zero and idempotents. Such elements also occur in general B-algebras. An
element $a \ne 0$ is a divisor of zero if there exists an element $b \ne 0$ such that either


$ab = 0$ or $ba = 0.$ (2.6.11)


Both a and b are obviously singular; the contrary hypothesis would imply that the 
zero element is regular. 


An element $x$ such that
$x^2 = x$ (2.6.12)


is said to be idempotent. If $x \ne 0, e$, then $x$ is a divisor of zero and hence singular.
This whole discussion of regular and singular elements makes sense only if $\mathfrak{B}$ has
a unit element. In Section 9.1 we shall encounter alternative definitions for
B-algebras without unit element. Until then any B-algebra under consideration is
assumed to have a unit element.
In such a B-algebra the regular elements are particularly important, so we shall
consider the family of elements


$x_\lambda = \lambda e - x, \quad x$ fixed in $\mathfrak{B}$, (2.6.13)


with respect to regularity. Here $\lambda$ is a complex variable. The values of $\lambda$ fall into
two mutually exclusive classes according as $x_\lambda$ is regular or singular. The first
class is called the resolvent set and denoted by $\rho(x)$, the second the spectrum of
$x$ denoted by $\sigma(x)$. We recognize these concepts from the matrix case as well as
from operator algebras.


Theorem 2.6.5. The resolvent set is an open unbounded set in the complex
plane and contains
$|\lambda| > r(x)$ (2.6.14)


as a subset, where $r(x)$ is the spectral radius of $x$. The spectrum is a closed
bounded set confined to the disk
$|\lambda| \le r(x)$ (2.6.15)


and there is at least one point of $\sigma(x)$ on the boundary.


Proof. That $\rho(x)$ is open follows from Theorem 2.6.3. For large values of $|\lambda|$ we
have
$R(\lambda, x) = (\lambda e - x)^{-1} = e\lambda^{-1} + x\lambda^{-2} + \cdots + x^{n-1}\lambda^{-n} + \cdots.$ (2.6.16)


See Lemmas 1.5.1 and 2.4.1 for matrices and operator algebras, respectively. The
series converges in norm for $\lambda$ satisfying (2.6.14) and diverges for $|\lambda| < r(x)$ since
the norms of the terms are unbounded. In the convergence case we multiply the
series, right or left, by $\lambda e - x$; the resulting product is $e$, so that the series represents
$R(\lambda, x)$ for $|\lambda| > r(x)$.

It is clear that the spectrum must be confined to the disk (2.6.15). It cannot be
confined to a smaller disk with center at the origin of the $\lambda$-plane. To prove this
rigorously we need basic properties of $\mathfrak{B}$-valued holomorphic functions which will
be developed in Chapter 8. Such functions are representable by Cauchy integrals,
Taylor's and Laurent series under essentially the same conditions as Cauchy
holomorphic functions. Now $R(\lambda, x)$ is locally holomorphic with values in $\mathfrak{B}$.
In particular, $R(\lambda, x)$ is holomorphic for


$|\lambda| > r = \max [|\alpha|; \alpha \in \sigma(x)]$


and as such admits of a unique power series expansion in $1/\lambda$ convergent in norm for
$|\lambda| > r$. This expansion must coincide with (2.6.16) and the latter must converge
for $|\lambda| > r$ and in no larger region. It follows that $r = r(x)$ and


$r(x) = \max [|\alpha|; \alpha \in \sigma(x)]$ (2.6.17)
as asserted. In particular, the spectrum is non-empty. ∎


Further study of the resolvent is postponed until Chapter 9. 
EXERCISE 2.6

1. Verify that if $\mathfrak{X}$ is a B-space and $\mathfrak{E}(\mathfrak{X})$ the set of all linear bounded transformations
from $\mathfrak{X}$ into itself, then $\mathfrak{E}(\mathfrak{X})$ is a B-algebra.

2. Prove that if $x$ is singular, so is any constant multiple of $x$, and use this to prove that
the set of singular elements is connected.

3. Is the resolvent set $\rho(x)$ necessarily connected?

4. Why is the spectrum a closed point set?

5. Prove that $x \in \mathfrak{B}$ can have at most one inverse.

6. If $x$ has an inverse, show that $x^n$, $n$ positive integer, has the same property and verify
that $(x^n)^{-1} = (x^{-1})^n$.

7. Let $q$ be a nilpotent of degree $k > 1$. Show that $e + q$ has an inverse and compute it.
Note that $\|q\|$ need not be $< 1$.

8. Under the same assumptions, for what values of the complex number $\zeta$ does the series
$\sum_n (e + q)^n \zeta^n$ converge in norm?

9. Why does (2.6.8) converge in norm in the sphere (2.6.7)?

10. If $T$ is defined by (2.6.10) prove that $\|T^n\| \le (n!)^{-1}$ and that $r(T) = 0$.

11. Suppose that $\lambda \in \rho(a)$, $a \in \mathfrak{B}$, $h \in \mathfrak{B}$, and that $\alpha$ is a small complex number. Use the
technique of the proof of Theorem 2.6.3 to develop $R(\lambda, a + \alpha h)$ in powers of $\alpha$.
Note that $a$ and $h$ need not commute. The expansion is of the form

$R + \alpha RhR + \cdots + \alpha^n RhRh \cdots hR + \cdots,$
where $R = R(\lambda, a)$ and the $n$th term contains $n$ factors $\alpha$ and $h$ and $n + 1$ factors $R$.

12. Let $a$ and $h$ be arbitrary non-commuting elements of $\mathfrak{B}$ and find for positive
integers $n$

$\lim_{\alpha \to 0} \dfrac{1}{\alpha} [(a + \alpha h)^n - a^n].$
3 SOME SPECIAL LINEAR SPACES


The discussion in the preceding chapter suffered from the lack of illustrative
examples. In particular, we were short on abstract spaces of infinite dimension.
The spaces $C^n$, $\mathfrak{M}_n$, and $\mathfrak{E}(C^n)$ which served as our mainstay are finite dimensional.
They were studied at some length in Chapter 1, where they served to introduce the
general concepts elaborated in Chapter 2. We shall now give some
specific instances of linear vector spaces of infinite dimension. In the present
chapter the examples will be of a fairly elementary nature and do not require much
background on the part of the reader. The next chapter is devoted to Lebesgue
spaces and will require more skill and preparation.

The present chapter has three sections: Sequence spaces; Continuous functions;
and Functions of bounded variation.


3.1 SEQUENCE SPACES 


These spaces are concerned with mappings of the positive integers $Z^+$ into one of the
spaces $C$ or $R$, more generally into a complete metric space.

The concept of a sequence involves the ordering of the natural numbers. These 
numbers are often taken for granted; the German mathematician Leopold 
Kronecker (1823-91) maintained that these numbers were created by God and 
everything else in mathematics had been achieved by man. This reverence for the 
natural numbers got a jolt in 1889 when the Italian logician and mathematician 
Giuseppe Peano (1858-1932) devised a set of postulates to characterize their
properties. This event marks the beginning of a new era in mathematics, an end 
to the rough-and-ready pioneering period of the Calculus and a return to the 
axiomatic method of the Greeks. Peano dealt with an undefined set of elements Z 
and an undefined notion of immediate successor of an element x of Z subject to the 
following conditions: 


(Z$_1$) Each element $x$ of $Z$ has a unique immediate successor $x^+ \in Z$.
(Z$_2$) If $x^+ = y^+$, then $x = y$.
(Z$_3$) $1 \in Z$ and there is no $x \in Z$ such that $x^+ = 1$.
(Z$_4$) Any subset of $Z$ that contains 1 and the successor of each of its elements
coincides with $Z$.


It is clear that the set $Z^+$ of all positive integers gives a realization of such a set
$Z$ provided $n + 1$ is taken as the immediate successor of $n$.
The fourth postulate implies the principle of induction, which we state as a
theorem without a proof.


Theorem 3.1.1. Suppose that for each natural number $n$ there is a proposition
$P(n)$. Suppose that $P(1)$ is known to be true and that $P(k^+)$ is true whenever
$P(k)$ is true. Then $P(n)$ is true for all $n$.


After these preliminaries we turn to sequences. Let $\mathfrak{X}$ be an abstract space and
consider a mapping $T$ of $Z^+$ into $\mathfrak{X}$. That is, we consider a collection of ordered
pairs

$\{(n, x_n) : n \in Z^+, \; x_n \in \mathfrak{X}\}.$ (3.1.1)


Here the domain of $T$ is the set of natural numbers ordered by the notion of
immediate successor, the range is a subset of $\mathfrak{X}$. Note that the ordering of the
domain induces an order of the range. The ordered range is the associated sequence


$x_1, x_2, \ldots, x_n, \ldots$ or $\{x_n\}$ (3.1.2)


for short. Note that the sequence $\{x_n\}$ may also be regarded as a realization of the
Peano postulates provided we replace 1 by $x_1$ and define $x_{n+1}$ as the immediate
successor of $x_n$.
If $\mathfrak{X}$ is a linear vector space, we can define algebraic operations on sequences
with range in $\mathfrak{X}$. The sum of
$a = \{a_n\}, \quad b = \{b_n\}$
is taken to be


$a + b = \{a_n + b_n\}.$ (3.1.3)
Similarly, a constant times a sequence is the sequence
$\alpha a = \{\alpha a_n\}.$ (3.1.4)


If $\mathfrak{X}$ is not merely a linear vector space but actually an algebra, the product of two
sequences may be defined as the sequence of the products of corresponding
components
$ab = \{a_n b_n\}.$ (3.1.5)


The set $S(\mathfrak{X})$ of all sequences $\{x_n\}$ with range in $\mathfrak{X}$ becomes a linear vector
space provided $\mathfrak{X}$ is such a space and the algebraic operations are defined by (3.1.3)
and (3.1.4). If $\mathfrak{X}$ is an algebra, so is $S(\mathfrak{X})$ if multiplication of elements is defined by
(3.1.5). It should be observed, however, that other definitions of element multiplication
are feasible and more appropriate for the applications. Some instances are
given in Exercise 3.1.

If $\mathfrak{X}$ is an algebra with unit element $e$ and multiplication is defined by (3.1.5),
then $S(\mathfrak{X})$ also has a unit element, namely the vector $\mathbf{e}$ all components of which
equal $e$, and now

$a\mathbf{e} = \mathbf{e}a = a.$ (3.1.6)


Multiplication is commutative in $S(\mathfrak{X})$ if it is commutative in $\mathfrak{X}$. The element
$x \in S(\mathfrak{X})$ has an inverse provided all components $x_n$ are regular elements of $\mathfrak{X}$, in
which case
$x^{-1} = \{x_n^{-1}\}.$ (3.1.7)


We have defined an algebraic structure for $S(\mathfrak{X})$. Is it also possible to define a
topological structure? Here it seems reasonable to restrict oneself to the case where
$\mathfrak{X}$ is a complete metric space. We can then also define a distance $d(a, b)$ between two
sequences $a$ and $b$; in fact, this can be done in infinitely many ways. Following
Maurice Fréchet (1906), we set


$d(a, b) = \sum_{n=1}^{\infty} 2^{-n} \dfrac{d(a_n, b_n)}{1 + d(a_n, b_n)}.$ (3.1.8)


The multipliers $2^{-n}$ may be replaced by any sequence of positive numbers $\{\alpha_n\}$
such that $\sum \alpha_n$ is convergent. It is clear that $d(a, b) \ge 0$ and $d(a, b) = 0$ iff $a = b$.
Now two sequences $a$ and $b$ are equal by definition iff corresponding components
are equal, i.e. $a_n = b_n$ for all $n$. Further, $d(a, b) = d(b, a)$. This takes care of
postulates (D$_1$) and (D$_2$) for distance. It is not clear that the triangle inequality
(D$_3$) is satisfied. That it holds is a simple consequence of the triangle inequality in
$\mathfrak{X}$ together with the elementary inequality


$\dfrac{s + t}{1 + s + t} \le \dfrac{s}{1 + s} + \dfrac{t}{1 + t},$ (3.1.9)


valid for all positive $s$ and $t$. The verification is left to the reader.
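
The Fréchet distance (3.1.8) can be approximated by a partial sum, since the tail beyond $N$ terms is less than $2^{-N}$. The sketch below does this for complex sequences given as functions of the index; the truncation length, the sample sequences, and all names are illustrative assumptions.

```python
# Sketch: truncated Frechet distance on S(C), with d(a_n, b_n) = |a_n - b_n|.
# Sequences are modelled as functions n -> complex for n = 1, 2, 3, ...

def frechet_distance(a, b, terms=50):
    """Partial sum of (3.1.8) over the first `terms` components."""
    total = 0.0
    for n in range(1, terms + 1):
        d = abs(a(n) - b(n))
        total += 2.0 ** (-n) * d / (1.0 + d)
    return total

def elementary_inequality_holds(s, t):
    """Check (3.1.9) for one pair of positive numbers."""
    return (s + t) / (1 + s + t) <= s / (1 + s) + t / (1 + t) + 1e-15

a = lambda n: 1.0 / n        # a convergent sequence
b = lambda n: (-1.0) ** n    # a bounded, non-convergent sequence
c = lambda n: n * 1j         # an unbounded sequence

dab = frechet_distance(a, b)
dbc = frechet_distance(b, c)
dac = frechet_distance(a, c)
print(dab, dbc, dac)
print(dac <= dab + dbc)      # triangle inequality for this sample
```

Note that the distance is always less than 1, even between a bounded and an unbounded sequence, which is one sense in which this metric is poorly adapted to the special sequence spaces that follow.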

The space $S(\mathfrak{X})$ is not of much general interest. The special case $\mathfrak{X} = C$, i.e. the
algebra of complex-valued sequences, is more interesting, but to obtain concepts
of greater importance we have to specialize still further. The following
possibilities are noteworthy:


i) bounded sequences, i.e. $|a_n|$ bounded;
ii) $\lim a_n$ exists;
iii) $\sum_{n=1}^{\infty} |a_n|^p$ converges for some fixed $p \ge 1$.


In all three cases the components $a_n$ are complex numbers.

Sequences which satisfy condition (i) form the space $l_\infty$ of bounded sequences.
Sequences satisfying (ii) form the space $c$ of convergent sequences. The theory of
infinite series is concerned primarily with the characterization of the elements of the
space $c$. Finally, the space $l_p$ is the set of all sequences satisfying condition (iii).
Each of these spaces is linear. They are also algebras, in the case of $l_p$ without unit
element, since the vector $\mathbf{1}$ belongs to no $l_p$-space with a finite $p$. It does belong to
$l_\infty$, however.

The metric defined by (3.1.8), of course, also makes sense in these special cases,
but it is not well adjusted to any one of them. In each of the three cases listed above
it is possible to introduce a norm. In cases (i) and (ii) the so-called sup-norm is
appropriate:

$\|a\| = \sup_n |a_n|.$ (3.1.10)
In case (iii) we use the $l_p$-norm


$\|a\|_p = \left[ \sum_{n=1}^{\infty} |a_n|^p \right]^{1/p}.$ (3.1.11)


For $p = 2$ this is the natural generalization of the Euclidean norm. The case $p \ne 2$
was also considered in Section 1.6 for finite sequences, namely the coordinates of a
point in $C^n$.

That (3.1.10) defines a norm is clear. The reader will have no difficulty in
verifying the norm postulates. This statement holds also for $p = 1$ in case (iii).
The case $p > 1$ is different. First it is not obvious that the space is linear as
asserted, i.e. $a \in l_p$, $b \in l_p$ implies $a + b \in l_p$. Secondly, the triangle inequality must
be verified. The reader should also prove completeness of $l_p$ in the normed metric.

The first step is simple enough. If $s$ and $t$ are non-negative


$s + t \le 2 \max (s, t),$
so that
$(s + t)^p \le 2^p [\max (s, t)]^p \le 2^p [s^p + t^p].$
Hence


$\sum |a_n + b_n|^p \le \sum [|a_n| + |b_n|]^p \le 2^p \sum [|a_n|^p + |b_n|^p]$ (3.1.12)


and $a \in l_p$, $b \in l_p$ implies $a + b \in l_p$.

For the second step we need Hölder's inequality (Otto Hölder, 1859-1937).
This inequality involves the notion of conjugate or adjoint sequence spaces. The
spaces $l_p$ and $l_q$ are conjugate if


$\dfrac{1}{p} + \dfrac{1}{q} = 1, \quad 1 < p.$ (3.1.13)
This means that $l_2$ is self-conjugate and we extend the definition to admit $l_\infty$ as the
conjugate of $l_1$.


Theorem 3.1.2. Suppose that $\{a_n\} \in l_p$, $\{b_n\} \in l_q$, $1 < p < \infty$. Then $\{a_n b_n\} \in l_1$
and
$\sum_{n=1}^{\infty} |a_n b_n| \le \left\{ \sum_{n=1}^{\infty} |a_n|^p \right\}^{1/p} \left\{ \sum_{n=1}^{\infty} |b_n|^q \right\}^{1/q}.$ (3.1.14)
Equality holds here iff there exists a fixed number $\mu > 0$ such that
$|a_n|^p = \mu |b_n|^q, \quad \forall n.$ (3.1.15)


We shall use


Lemma 3.1.1. If $A$ and $B$ are positive numbers and $0 < s < 1$, then


$A^s B^{1-s} \le As + B(1 - s)$ (3.1.16)
with equality iff $A = B$.




Proof. If $A = B$, there is nothing to prove. Suppose $A \ne B$. Then the arc
$t = A^s B^{1-s}, \quad 0 < s < 1,$


in the $(s, t)$-plane is concave upwards and hence lies below the chord joining the
endpoints the equation of which is


$t = As + B(1 - s)$
and (3.1.16) follows. ∎


Proof of Theorem 3.1.2. In (3.1.16) we set
$A = |a_n|^p, \quad B = |b_n|^q, \quad s = \dfrac{1}{p},$


and obtain
$|a_n b_n| \le \dfrac{1}{p} |a_n|^p + \dfrac{1}{q} |b_n|^q.$


Summing for $n$ we get


$\sum_{n=1}^{\infty} |a_n b_n| \le \dfrac{1}{p} \sum_{n=1}^{\infty} |a_n|^p + \dfrac{1}{q} \sum_{n=1}^{\infty} |b_n|^q.$ (3.1.17)
This inequality is of some interest in itself and it provides an alternative proof that
$a \in l_p$, $b \in l_q$ implies $ab \in l_1$. Note that equality here holds iff


$|a_n|^p = |b_n|^q, \quad \forall n.$


The desired inequality (3.1.14) is obtained from (3.1.17) by a peculiar artifice.
In (3.1.17) we replace $a_n$ by $a_n \omega^{-1}$ and $b_n$ by $b_n \omega$ for all $n$. Here $\omega$ is an arbitrary
positive number independent of $n$. The left member is unchanged but the inequality
now becomes


$\sum_{n=1}^{\infty} |a_n b_n| \le \dfrac{1}{p} \omega^{-p} \sum_{n=1}^{\infty} |a_n|^p + \dfrac{1}{q} \omega^{q} \sum_{n=1}^{\infty} |b_n|^q$ (3.1.18)
with equality iff
$|a_n|^p = \omega^{p+q} |b_n|^q, \quad \forall n.$ (3.1.19)


Now $\omega$ is at our disposal and we may choose it so that the right member of
(3.1.18) is as small as possible. To simplify the notation let us write


$\sum_{n=1}^{\infty} |a_n|^p = S, \quad \sum_{n=1}^{\infty} |b_n|^q = T \ne 0.$


Thus the problem is to minimize


$F(\omega) = \dfrac{1}{p} S \omega^{-p} + \dfrac{1}{q} T \omega^{q}.$
Pp 4 
The minimum is reached for
$\omega = \left( \dfrac{S}{T} \right)^{1/(p+q)}$


and is found to be


$S^{q/(p+q)} T^{p/(p+q)} = S^{1/p} T^{1/q}$
since $p + q = pq$. This is the right member of (3.1.14).
If equality is to hold in (3.1.14), it must hold in one of the secondary inequalities
(3.1.18), i.e. there exists an $\omega$ such that (3.1.19) holds. Conversely, if (3.1.19) holds,
then we have equality in (3.1.18) as well as in (3.1.14). ∎
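
The minimization step can be checked numerically for finite sequences, which are elements of every $l_p$. The sketch below computes both sides of (3.1.14), evaluates $F(\omega)$ at the stated minimizer, and tests the equality case (3.1.15); the sample data and function names are illustrative assumptions.

```python
# Sketch: Hoelder's inequality for finite sequences and the role of omega.
# All names and sample values are illustrative.

def holder_data(a, b, p):
    """Return (lhs, S, T, q) with lhs = sum |a_n b_n|, S = sum |a_n|^p,
    T = sum |b_n|^q and 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    S = sum(abs(x) ** p for x in a)
    T = sum(abs(y) ** q for y in b)
    return lhs, S, T, q

def F(w, S, T, p, q):
    """Right member of (3.1.18) as a function of omega."""
    return S * w ** (-p) / p + T * w ** q / q

p = 3.0
a = [1.0, -2.0, 0.5, 3.0]
b = [0.3, 1.5, -2.0, 0.7]

lhs, S, T, q = holder_data(a, b, p)
rhs = S ** (1 / p) * T ** (1 / q)        # right member of (3.1.14)
w0 = (S / T) ** (1 / (p + q))            # minimizing omega
print(lhs <= rhs, abs(F(w0, S, T, p, q) - rhs))

# Equality case (3.1.15): |a_n|^p = mu * |b_n|^q with mu = 2.
b_eq = [1.0, 2.0, 3.0]
a_eq = [(2.0 * y ** q) ** (1 / p) for y in b_eq]
lhs_eq, S_eq, T_eq, _ = holder_data(a_eq, b_eq, p)
rhs_eq = S_eq ** (1 / p) * T_eq ** (1 / q)
print(abs(lhs_eq - rhs_eq))              # close to zero
```

Evaluating `F` slightly to either side of `w0` gives larger values, which is a quick sanity check that the stationary point is indeed the minimum.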


Next we have Minkowski’s inequality. 


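Theorem 3.1.3. We have


$\left\{ \sum_{n=1}^{\infty} [|a_n| + |b_n|]^p \right\}^{1/p} \le \left\{ \sum_{n=1}^{\infty} |a_n|^p \right\}^{1/p} + \left\{ \sum_{n=1}^{\infty} |b_n|^p \right\}^{1/p}$ (3.1.20)
if either side is finite.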


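Proof. Formula (3.1.12) shows that both sides are finite if $a$ and $b \in l_p$. We have
then for $p > 1$


$\sum_{n=1}^{\infty} [|a_n| + |b_n|]^p = \sum_{n=1}^{\infty} [|a_n| + |b_n|] \, [|a_n| + |b_n|]^{p-1}$

$= \sum_{n=1}^{\infty} |a_n| \, [|a_n| + |b_n|]^{p-1} + \sum_{n=1}^{\infty} |b_n| \, [|a_n| + |b_n|]^{p-1}.$


We use Hölder's inequality on each of the two series in the last member. We take
the inequality in the form


$\sum_{n=1}^{\infty} x_n y_n \le \left\{ \sum_{n=1}^{\infty} x_n^r \right\}^{1/r} \left\{ \sum_{n=1}^{\infty} y_n^s \right\}^{1/s}, \quad \dfrac{1}{r} + \dfrac{1}{s} = 1, \quad x_n \ge 0, \; y_n \ge 0.$
We take first
$x_n = |a_n|, \quad y_n = [|a_n| + |b_n|]^{p-1}, \quad r = p, \quad s = q,$
and obtain, since $(p - 1)q = p$,


$\sum_{n=1}^{\infty} |a_n| \, [|a_n| + |b_n|]^{p-1} \le \left\{ \sum_{n=1}^{\infty} |a_n|^p \right\}^{1/p} \left\{ \sum_{n=1}^{\infty} [|a_n| + |b_n|]^p \right\}^{1/q}.$


The second series requires only to interchange $a_n$ and $b_n$. Adding the two estimates
gives


$\sum_{n=1}^{\infty} [|a_n| + |b_n|]^p \le \left\{ \sum_{n=1}^{\infty} [|a_n| + |b_n|]^p \right\}^{1/q} \left\{ \left( \sum_{n=1}^{\infty} |a_n|^p \right)^{1/p} + \left( \sum_{n=1}^{\infty} |b_n|^p \right)^{1/p} \right\}.$


Simplification leads to (3.1.20). ∎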
Corollary. $\|a + b\|_p \le \|a\|_p + \|b\|_p$, $1 \le p$.


We can rewrite (3.1.14) as follows:
$\|ab\|_1 \le \|a\|_p \, \|b\|_q$ (3.1.21)


assuming the right member to be meaningful. The special case $p = q = 2$ is
Cauchy's inequality.

The space $l_2$ is an inner product space, in fact a Hilbert space, the first
Hilbert space of infinite dimension that we have encountered so far. With obvious
notation we define


$(a, b) = \sum_{n=1}^{\infty} a_n \bar{b}_n.$ (3.1.22)


The proof that this actually defines an inner product is left to the reader. The series
is absolutely convergent for every $a$ and $b$ in $l_2$. By (3.1.21)


$|(a, b)| \le \|a\|_2 \, \|b\|_2.$ (3.1.23)


A complete orthonormal system is furnished by the unit vectors $u_j$ where $u_j$ has a
one in the $j$th place and zeros elsewhere. If
$(x, u_j) = x_j,$ (3.1.24)


then the partial sums
$\sum_{j=1}^{n} x_j u_j$
converge in the metric of the space to the limit
$\sum_{j=1}^{\infty} x_j u_j = \{x_j\},$ (3.1.25)


which is an element of the space since $\sum_{j=1}^{\infty} |x_j|^2$ is convergent by Bessel's
inequality.

Linear functionals were mentioned in Sections 1.1 and 1.6 and discussed for
B-spaces in Section 2.3. Here we shall give a complete determination of the linear
bounded functionals on $l_p$. The construction is essentially based on (3.1.21). We
recall that a functional is a function on vectors to numbers, $F(x)$. It is linear if the
domain is a linear vector space $\mathfrak{X}$ and if


$F(x + y) = F(x) + F(y),$ (3.1.26)
$F(\alpha x) = \alpha F(x).$ (3.1.27)


If $\mathfrak{X}$ is a B-space, the functional is bounded if there is a finite $M$ such that
$|F(x)| \le M \|x\|$, $\forall x$. Its norm is then given by


$\|F\| = \sup_{\|x\| = 1} |F(x)|.$ (3.1.28)
If now $\mathfrak{X} = l_p$, $1 < p < \infty$, we proceed as follows. To start with, let $b$ be a
fixed vector in $l_q$, the conjugate sequence space, and let $x$ be an arbitrary vector in
$l_p$. If
$x = \{x_n\}, \quad b = \{b_n\},$
we form the infinite series


$\sum_{n=1}^{\infty} b_n x_n = F(x, b).$ (3.1.29)


By (3.1.21) the series is absolutely convergent for all $x \in l_p$. $F(x, b)$ is a function from
vectors to numbers defined on $l_p$. It is a linear functional on $l_p$ and by (3.1.21) it is
bounded and its norm is at most equal to $\|b\|_q$ since


$|F(x, b)| \le \|b\|_q \, \|x\|_p.$ (3.1.30)


To prove that the norm is exactly equal to $\|b\|_q$, observe that it is possible to choose
a vector $x \in l_p$ with $\|x\|_p = 1$ such that


$F(x, b) = \|b\|_q.$ (3.1.31)

These conditions will be met if there exists a fixed $\omega > 0$ such that for each $n$
$|x_n|^p = \omega |b_n|^q$ and $b_n x_n \ge 0.$ (3.1.32)
If $b_n = 0$ for some particular value of $n$, set $x_n = 0$. For other values of $n$ we choose


$|x_n| = \omega^{1/p} |b_n|^{q/p}, \quad x_n = \dfrac{|b_n|}{b_n} |x_n|$ (3.1.33)


and, finally, we adjust $\omega$ so that $\|x\|_p = 1$. This gives a vector $x$ such that (3.1.31)
holds. It follows that
$\|F(\cdot, b)\| = \|b\|_q,$ (3.1.34)


as asserted. Thus we have shown that to each fixed vector $b \in l_q$ corresponds a
linear bounded functional $F(x, b)$ on $l_p$. It is natural to ask if all linear bounded
functionals on $l_p$ are of this form. The answer is in the affirmative.
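
The extremal vector described by (3.1.32) and (3.1.33) can be written down explicitly for a finite sequence $b$. The sketch below builds it, normalizes it in $l_p$, and compares $F(x, b)$ with $\|b\|_q$; the sample vector `b` and the function names are illustrative assumptions, and `b` is assumed not to be the zero vector.

```python
# Sketch: a unit vector x in l_p with F(x, b) = sum b_n x_n = ||b||_q.
# Finite sequences only; all names are illustrative.

def lp_norm(v, p):
    return sum(abs(vn) ** p for vn in v) ** (1.0 / p)

def extremal_vector(b, p):
    """Vector with |x_n| proportional to |b_n|^(q/p) and b_n x_n >= 0,
    scaled so that ||x||_p = 1. Assumes b is not the zero vector."""
    q = p / (p - 1.0)
    x = [0j if bn == 0 else abs(bn) ** (q / p) * (abs(bn) / bn) for bn in b]
    scale = lp_norm(x, p)   # plays the role of omega^(-1/p)
    return [xn / scale for xn in x]

p = 3.0
q = p / (p - 1.0)
b = [3 - 4j, 0j, -2 + 0j, 1j]

x = extremal_vector(b, p)
value = complex(sum(bn * xn for bn, xn in zip(b, x)))

print(lp_norm(x, p))                       # 1 up to rounding
print(value.real, lp_norm(b, q))           # the two numbers agree
print(abs(value.imag))                     # close to zero
```

Each product $b_n x_n$ equals $|b_n| \, |x_n|$, so the sum is real and non-negative, and the normalization is what fixes $\omega$.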


Theorem 3.1.4. If $F(x)$ is a linear bounded functional on $l_p$, then there exists a
vector $b \in l_q$ such that (3.1.30) holds.


Proof. Consider the unit vectors $u_k$ which obviously belong to $l_p$. We recall that
$u_k$ is the sequence with a one in the $k$th place and zeros elsewhere. Suppose that


$F(u_k) = b_k, \quad k = 1, 2, 3, \ldots.$ (3.1.35)


This determines a vector $b = \{b_k\}$ and it is to be shown that $b \in l_q$ and


$F(x) = \sum_{k=1}^{\infty} b_k x_k.$ (3.1.36)
Restrict $x$ temporarily to the subspace $\mathfrak{X}_N$ where $x_k = 0$ for $k > N$. This is a linear
subspace on which $F(x)$ is linear so that


$F(x) = \sum_{k=1}^{N} b_k x_k, \quad x \in \mathfrak{X}_N.$


We now proceed as in the proof of (3.1.34). We choose $x \in \mathfrak{X}_N$ such that
$b_k x_k = |b_k x_k|, \quad |x_k|^p = |b_k|^q, \quad k = 1, 2, \ldots, N.$
This can be done and by (3.1.21) applied to $\mathfrak{X}_N$


$F(x) = \sum_{k=1}^{N} |b_k x_k| = \left\{ \sum_{k=1}^{N} |b_k|^q \right\}^{1/q} \|x\|_p.$


By assumption $F(x)$ is a bounded functional, i.e. there exists a finite $M$ such that


$|F(x)| \le M \|x\|_p$
for all $x \in l_p$. This implies that


$\left\{ \sum_{k=1}^{N} |b_k|^q \right\}^{1/q} \le M.$
Since this holds for all $N$, it follows that $b \in l_q$ and
$F(x, b) = \sum_{n=1}^{\infty} b_n x_n$


is a linear bounded functional on $l_p$. Here $F(x)$ and $F(x, b)$ coincide on each subspace
$\mathfrak{X}_N$. To finish the proof we need only observe that a linear bounded functional
on a B-space is continuous in the normed topology. For $l_p$ this is expressed by


Lemma 3.1.2. If $G(x)$ is a linear bounded functional on $l_p$, then
$\lim \|x_n - x_0\|_p = 0$ implies $\lim G(x_n) = G(x_0).$ (3.1.37)

The proof is left to the reader.

We now take an arbitrary $x = \{x_k\} \in l_p$ and define a sequence of vectors $x_N$
where the first $N$ coordinates of $x_N$ coincide with those of $x$ and the remaining
ones are zero. Here

\[ \lim_{N \to \infty} \|x_N - x\|_p = 0. \]

Hence

\[ \lim_{N \to \infty} F(x_N) = F(x), \quad \lim_{N \to \infty} F(x_N, b) = F(x, b). \]

Since $F(x_N) = F(x_N, b)$ for all $N$, their limits are also equal and $F(x) = F(x, b)$
for all $x$. ∎
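The extremal choice of $x$ used in the proof can be checked numerically. The following is only a sketch for real vectors of finite length; the function names `holder_extremal` and `norm` and the sample data are illustrative assumptions, not part of the text.

```python
def holder_extremal(b, p):
    """Return x with |x_k|^p = |b_k|^q and b_k * x_k = |b_k * x_k| (real case)."""
    q = p / (p - 1.0)
    # sign(b_k) * |b_k|^(q/p) makes every product b_k * x_k non-negative
    return [(1.0 if bk >= 0 else -1.0) * abs(bk) ** (q / p) for bk in b]


def norm(v, r):
    """The l_r norm of a finite vector."""
    return sum(abs(t) ** r for t in v) ** (1.0 / r)


b = [3.0, -1.0, 0.5, 2.0]          # sample vector (an assumption)
p = 3.0
q = p / (p - 1.0)

x = holder_extremal(b, p)
F_x = sum(bk * xk for bk, xk in zip(b, x))   # F(x, b) on the subspace X_N
ratio = F_x / norm(x, p)                     # should reproduce ||b||_q
b_norm_q = norm(b, q)
```

For this choice of $x$ the quotient $F(x)/\|x\|_p$ reproduces $\|b\|_q$ up to rounding, which is the equality case of Hölder's inequality that the proof relies on.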




The cases $p = 1$ and $p = \infty$ are left to the reader.
It remains to say a few words about linear bounded transformations from $l_p$
into itself. We can construct such transformations by an extension of the method
used in Section 1.3, but here new features arise. We normally have to use infinite
matrices, say

\[ A = (a_{jk}), \tag{3.1.38} \]

and form

\[ y_j = \sum_{k=1}^{\infty} a_{jk} x_k, \quad j = 1, 2, 3, \ldots. \tag{3.1.39} \]

Here we have to satisfy two requirements. For every $x \in l_p$ the infinitely many
series for the $y_j$'s must converge and the resulting vector $\{y_j\}$ must be an element
of $l_p$. This imposes heavy restrictions on the coefficients $a_{jk}$. Let us just state one
result as a sample:


Theorem 3.1.5. Let $1 < p < \infty$, let $1/p + 1/q = 1$, and let

\[ \|A\|_p = \Bigl\{ \sum_{j=1}^{\infty} \Bigl( \sum_{k=1}^{\infty} |a_{jk}|^q \Bigr)^{p/q} \Bigr\}^{1/p} \tag{3.1.40} \]

exist as a finite number. Then (3.1.39) defines a linear bounded transformation
from $l_p$ to $l_p$ and

\[ \|y\|_p \le \|A\|_p \|x\|_p. \tag{3.1.41} \]


The proof is based on Hölder's inequality and the details are left to the reader.
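A finite section of such a matrix gives a quick plausibility check of the inequality (3.1.41). The sketch below assumes the norm (3.1.40) is the $p$th-power sum over rows of the $l_q$ norms of the rows; the helper names and the sample matrix are hypothetical.

```python
def lp_norm(v, p):
    """The l_p norm of a finite vector."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)


def matrix_norm_p(A, p):
    """Finite-section analogue of (3.1.40): l_q norm of each row, then l_p over rows."""
    q = p / (p - 1.0)
    return sum(sum(abs(a) ** q for a in row) ** (p / q) for row in A) ** (1.0 / p)


def apply(A, x):
    """y_j = sum_k a_jk x_k, the finite analogue of (3.1.39)."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


# Sample 5 x 5 section with entries 1 / (j + k)^2 (an assumption for illustration)
A = [[1.0 / (j + k) ** 2 for k in range(1, 6)] for j in range(1, 6)]
x = [1.0, -2.0, 0.5, 3.0, -1.0]
p = 2.5

lhs = lp_norm(apply(A, x), p)                 # ||y||_p
rhs = matrix_norm_p(A, p) * lp_norm(x, p)     # ||A||_p ||x||_p
```

Applying Hölder's inequality to each row gives `lhs <= rhs`, which is the content of the exercise on a finite section.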


EXERCISE 3.1 


1. As an illustration of the principle of induction prove that for all $n$
$1^2 + 2^2 + \cdots + n^2 = \tfrac{1}{3} n (n + \tfrac{1}{2})(n + 1)$.

2. Verify (3.1.7). 

3. An alternate way of defining multiplication of sequences $a = \{a_n\}$, $b = \{b_n\}$,
where $n = 0, 1, 2, \ldots$, is to set $ab = \{c_n\}$, where

\[ c_n = a_0 b_n + a_1 b_{n-1} + \cdots + a_n b_0. \]

This may be called the Cauchy product in analogy with the Cauchy product for
infinite series. It is understood that $a_n$ and $b_n$ belong to a non-commutative
B-algebra with unit element $e$. Show that $(e, 0, \ldots, 0, \ldots)$ acts as unit element in the
Cauchy sequence algebra.


4. If multiplication of sequences is defined as in the preceding problem and the sequence
algebra is $l_1$, show that $\|ab\|_1 \le \|a\|_1 \|b\|_1$ and determine when equality holds.


5. Another definition of multiplication is based on the so-called Dirichlet product. Here
$ab = \{d_n\}$, where

\[ d_n = \sum_{jk = n} a_j b_k, \quad 0 < n, \]

and the summation is extended over the divisors of $n$. Is there a unit element?


6. Discuss the analogue of Problem 4 for the Dirichlet product. 


7. Prove that $s/(s + 1)$ is increasing for $s > 0$ and verify (3.1.9).


8. Prove the triangle inequality if d(a, b) is given by (3.1.8). 


9. Fill in missing details in the proof of Theorem 3.1.2. 


10. Under what conditions on the vector $b$ does (3.1.29) define a linear bounded
functional on /,?


11. Same question for /,.


12. Construct linear bounded functionals on $c$. Show that $\lim_{n \to \infty} x_n$ is such a
functional if $x = \{x_n\}$.

13. Prove Lemma 3.1.2.

14. Show that

\[ y_n = \frac{x_1 + x_2 + \cdots + x_n}{n}, \quad n = 1, 2, 3, \ldots \]

defines a linear bounded transformation from the space $c$ into itself. Show that limits
are preserved, i.e. $\lim x_n = s$ implies $\lim y_n = s$. This fact is known as Cauchy's
First Theorem. It is the transformation that defines so-called (C, 1)-summability
[after Ernesto Cesàro (1859-1906)], also known as the arithmetic means of order one.
See Chapters 14 and 15.

15. Show that the transformation in Problem 14 is also linear and bounded from $l_\infty$ to
itself. Show that $\lim y_n$ exists for the sequence $\{x_n\}$, where $x_{2k-1} = 0$, $x_{2k} = 1$,
$k = 1, 2, 3, \ldots$. What does this fact signify?

16. Show that the transformation of the preceding problems is linear but not bounded
on $l_1$. Exhibit a sequence $\{x_n\} \in l_1$ such that the transformed sequence $\{y_n\}$ is not
in $l_1$.

17. Prove Theorem 3.1.5.

18. When is the form $(Ax, y)$, $x, y \in l_2$, Hermitian?

19. Find sufficient conditions on the $a_{jk}$ so that (3.1.29) defines a linear bounded
transformation from /, to itself.

20. Prove that $l_p$ is separable for $1 \le p < \infty$.

21. Prove that $l_\infty$ is not separable.

22. Is the space $c$ separable?


The remaining problems deal with certain mean values of numbers and are related to
formulas for integrals in Section 4.4. See also Chapter 15, where these notions are
elaborated.


23. Given $n$ positive numbers $a_1, a_2, \ldots, a_n$. For $0 < r$ set

\[ M_r(a) = \Bigl[ \frac{1}{n} \sum_{j=1}^{n} a_j^r \Bigr]^{1/r} \]

and show that

\[ \min a_j \le M_r(a) \le \max a_j \]

with equality iff all $a_j$ are equal.

24. Show that $M_r(a) \le M_s(a)$ for $r < s$ with equality iff all $a_j$ are equal.

25. Show that $\lim_{r \to \infty} M_r(a) = \max a_j$.


3.2 CONTINUOUS FUNCTIONS 


The next system discussed in this chapter is the set of complex-valued functions
continuous in a finite closed interval $[a, b]$. This set is denoted by $C[a, b]$. The
reader learnt in the Calculus that the sum of two continuous functions is a
continuous function, so is a constant multiple of a continuous function as well as the
product of two such functions. In other words, $C[a, b]$ is an algebra if we define

\[ (f + g)(t) = f(t) + g(t), \tag{3.2.1} \]
\[ (\alpha f)(t) = \alpha f(t), \tag{3.2.2} \]
\[ (fg)(t) = f(t) g(t). \tag{3.2.3} \]

This algebra is commutative and it has a unit element, the function $1(t)$, which
is identically equal to 1 in $[a, b]$. There are regular elements: we have clearly

\[ f(t) [f(t)]^{-1} = [f(t)]^{-1} f(t) = 1(t) \tag{3.2.4} \]

iff $f(t) \ne 0$ in $[a, b]$. All functions $f(t)$ which assume the value 0 anywhere in
$[a, b]$ are singular. There are no proper nilpotents. There is a zero element $0(t)$,
the function which is identically 0 in $[a, b]$. These functions $0(t)$ and $1(t)$ are the
only idempotents. On the other hand, there are divisors of zero since

\[ f(t) g(t) = 0 \]

can hold for all $t$ without either factor being $0(t)$.

In this case it is quite easy to characterize the spectrum and the resolvent set of
one of the elements of the algebra. We recall that the spectrum is the set of complex
numbers $\lambda$ such that

\[ \lambda 1(t) - f(t) \]

is singular. This is the case iff this function can take on the value zero. Hence

\[ \sigma[f] = R[f], \tag{3.2.5} \]

the range of $f$. This is a bounded closed connected set in the complex plane. The
resolvent set is the complement of the spectrum and hence unbounded and open
and need not be connected. The resolvent of $f$ is simply

\[ R(\lambda, f)(t) = \frac{1}{\lambda - f(t)}. \tag{3.2.6} \]

This is a continuous function of $t$ for any fixed $\lambda$ in the resolvent set. For any fixed
$t$ it is a piece-wise holomorphic function of $\lambda$.

The function $f(t) = \exp(it)$ has a range which is an arc of the unit circle or the
whole unit circle according as $b - a < 2\pi$ or $\ge 2\pi$. In the latter case the resolvent
set has two disjoint components, the interior of the unit circle and the exterior.
The resolvent is holomorphic in each component but nowhere on the unit circle.
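As a numerical illustration of (3.2.6), the sup-norm of the resolvent is the reciprocal of the distance from $\lambda$ to the range of $f$. The sketch below samples $f(t) = \exp(it)$ on $[0, 2\pi]$; the function name `resolvent_sup_norm`, the grid size, and the two test values of $\lambda$ are assumptions made for the example.

```python
import cmath
import math


def resolvent_sup_norm(lam, f, a, b, samples=20001):
    """Approximate sup over [a, b] of |1 / (lam - f(t))| on a uniform grid."""
    ts = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(1.0 / abs(lam - f(t)) for t in ts)


def f(t):
    return cmath.exp(1j * t)      # range is the whole unit circle on [0, 2*pi]


# One point in each component of the resolvent set
inside = resolvent_sup_norm(0.5 + 0.0j, f, 0.0, 2.0 * math.pi)    # distance 1/2 to the circle
outside = resolvent_sup_norm(0.0 + 3.0j, f, 0.0, 2.0 * math.pi)   # distance 2 to the circle
```

The two values come out close to 2 and 1/2, i.e. $1/\bigl|\,|\lambda| - 1\,\bigr|$, and they grow without bound as $\lambda$ approaches the unit circle from either side.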

We can endow $C[a, b]$ with a metric based on the sup-norm

\[ \|f\| = \sup_{a \le t \le b} |f(t)| \tag{3.2.7} \]

if we set

\[ d(f, g) = \|f - g\|. \tag{3.2.8} \]

Actually we could replace "sup" by "max" in (3.2.7) since we deal with a finite
closed interval and a continuous function attains its maximum in such a set.

We recall another classical theorem concerning continuous functions: A
sequence of functions in $C[a, b]$ which converges uniformly with respect to $t$ in
$[a, b]$ converges to a continuous function. Let us translate this statement into our
new language. There is given a sequence $\{f_n\} \subset C[a, b]$. Uniform convergence
implies and is implied by

\[ \lim \|f_m - f_n\| = 0 \tag{3.2.9} \]

when $m$ and $n$ tend to infinity independently of each other. This says that $\{f_n\}$ is a
Cauchy sequence in the space and the classical convergence theorem asserts the
existence of an $f_0 \in C[a, b]$ such that

\[ \lim_{n \to \infty} \|f_0 - f_n\| = 0. \tag{3.2.10} \]

It follows that the normed algebra $C[a, b]$ is complete in the metric and we are
dealing with a B-algebra since clearly

\[ \|fg\| \le \|f\| \, \|g\|. \]

Continuous functions on a closed interval are uniformly continuous. This
property leads to the notion of a modulus of continuity. Set

\[ \mu(h; f) = \sup |f(t_1) - f(t_2)|, \quad t_1, t_2 \in [a, b], \quad |t_1 - t_2| \le h. \tag{3.2.11} \]

This is called the modulus of continuity of $f$. As a function of its first argument $\mu(h; f)$
is an element of $C[0, b - a]$. Further, $\mu(0; f) = 0$ and $\mu(h; f)$ is non-decreasing and
subadditive in $h$ so that

\[ \mu(h_1 + h_2; f) \le \mu(h_1; f) + \mu(h_2; f). \tag{3.2.12} \]

The verification of these properties is left to the reader.
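The definition (3.2.11) can be explored on a grid. The following sketch computes a grid version of the modulus of continuity, which is a lower bound for the true value since only sample points are compared. The function name `modulus`, the sample count, and the test function $\sqrt{t}$ on $[0, 1]$ are assumptions for illustration.

```python
import math


def modulus(f, a, b, h, samples=401):
    """Grid approximation of mu(h; f): largest |f(t1) - f(t2)| with |t1 - t2| <= h."""
    step = (b - a) / (samples - 1)
    vals = [f(a + step * i) for i in range(samples)]
    m = int(h / step + 1e-9)          # largest index gap allowed by |t1 - t2| <= h
    best = 0.0
    for i in range(samples):
        for j in range(i, min(samples, i + m + 1)):
            best = max(best, abs(vals[i] - vals[j]))
    return best


# For sqrt on [0, 1] the supremum is attained at t1 = 0, t2 = h
mu_quarter = modulus(math.sqrt, 0.0, 1.0, 0.25)
mu_half = modulus(math.sqrt, 0.0, 1.0, 0.5)
```

Here `mu_quarter` is $\sqrt{0.25}$ and `mu_half` is $\sqrt{0.5}$, so the two values illustrate both monotonicity and the subadditivity (3.2.12) with $h_1 = h_2 = 0.25$.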

What we have just discussed is uniform continuity with respect to $t$ for a fixed
element $f$ of $C[a, b]$. But we can also have uniform continuity for a subset of
elements $\{f_\alpha\}$. Such a set is said to be equicontinuous if, given any $\varepsilon > 0$, there
exists an $h > 0$ such that

\[ t_1, t_2 \in [a, b], \quad |t_1 - t_2| < h \quad \text{implies} \quad |f_\alpha(t_1) - f_\alpha(t_2)| < \varepsilon \tag{3.2.13} \]

for all $\alpha$ in the index set. If the latter is finite, equicontinuity of the set $\{f_\alpha\}$ holds
trivially. If, however, the set is infinite, a restriction is imposed on the
corresponding set of moduli of continuity $\mu(h; f_\alpha)$. Set

\[ \mu(h) = \sup_{\alpha} \mu(h; f_\alpha). \tag{3.2.14} \]

We have $\mu(0) = 0$, but normally $\mu(h) = +\infty$ for $h > 0$. The function $\mu$ is bounded
on the interval $0 \le h \le b - a$ iff $\mu(b - a; f_\alpha)$ is a bounded function of $\alpha$. If, in
addition,

\[ \lim_{h \to 0} \mu(h) = 0, \tag{3.2.15} \]

then $\mu(h)$ is continuous and subadditive as well as non-decreasing. In order to satisfy
(3.2.13) we merely choose $h$ so that $\mu(h) < \varepsilon$ and the equicontinuity follows.

The Bolzano-Weierstrass theorem holds in $C^n$: any bounded infinite set in $C^n$
has at least one cluster-point which is an element of $C^n$. The theorem does not hold
in $C[a, b]$. To illustrate this, take $a = 0$, $b = 1$, and $f_n = t^n$. The only point-wise
limit of the sequence $\{t^n\}$ on $[0, 1]$ is the discontinuous function $f(t)$, which is 0 for
$0 \le t < 1$ and 1 for $t = 1$. The sequence does not converge uniformly; it is not a
Cauchy sequence, nor does it contain a subsequence with the Cauchy property,
and the limit is not in $C[0, 1]$. The sequence is uniformly bounded in $[0, 1]$ but it
is not equicontinuous (why?) and this is decisive.
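The failure of the Cauchy property can be made concrete: with $u = t^n$ one has $t^n - t^{2n} = u - u^2$, whose maximum on $[0, 1]$ is $1/4$ for every $n$. The sketch below confirms this on a grid; the function name `sup_norm_diff` and the sample count are illustrative assumptions.

```python
def sup_norm_diff(n, m, samples=100001):
    """Grid approximation of the sup-norm of t^n - t^m on [0, 1]."""
    return max(abs(t ** n - t ** m)
               for t in (i / (samples - 1) for i in range(samples)))


# Distance between f_n and f_2n for several n; none of them tends to zero
gaps = [sup_norm_diff(n, 2 * n) for n in (1, 5, 25, 125)]
```

Every entry of `gaps` is close to 0.25, so no subsequence of $\{t^n\}$ can be a Cauchy sequence in the sup-norm.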

The following theorem is due to Cesare Arzelà (1847-1912). It is based on the
notion of equicontinuity introduced by Giulio Ascoli (1843-96). See references at
the end of the chapter for a proof.


Theorem 3.2.1. Let $F = \{f_\alpha\}$ be a family of functions in $C[a, b]$ which is
(1) uniformly bounded, $\|f_\alpha\| \le M$, $\forall \alpha$, and (2) equicontinuous. Then each sequence
in $F$ contains a uniformly convergent subsequence.


It is clear that the limit of the uniformly convergent subsequence is in $C[a, b]$,
but it need not be an element of $F$. If $F$ is closed in $C[a, b]$, then $F$ is said to be
(sequentially) compact, otherwise just conditionally compact.

There are any number of interesting families of functions with special
properties belonging to $C[a, b]$. Among these we single out (1) the family $L$ of
piece-wise linear and continuous functions, and (2) the family $P$ of polynomials.
We say that a family $F$ is dense in the space $C[a, b]$ if every $f \in C[a, b]$ is the limit
in the sense of the metric of a sequence $\{f_n\} \subset F$ so that

\[ \lim_{n \to \infty} \|f - f_n\| = 0. \tag{3.2.16} \]

Both $L$ and $P$ are dense in $C[a, b]$. For $L$ this is intuitively obvious, but we shall
give a proof since this brings out an important property of the function $t \to |t|$,
which will be used below in one of the proofs for $P$ being dense.


Theorem 3.2.2. The family $L$ of piece-wise linear continuous functions,
restricted to the interval $[a, b]$, is dense in $C[a, b]$.
Proof. We may take $a = 0$, $b = 1$ since the mapping

\[ s = a + t(b - a) \tag{3.2.17} \]

takes $f(s) \in C[a, b]$ into $f[a + t(b - a)] \in C[0, 1]$. Divide the interval $[0, 1]$ into
$n$ equal parts and set

\[ L_n(t) = \sum_{j=0}^{n-1} A_j \Bigl| t - \frac{j}{n} \Bigr| + A_n, \tag{3.2.18} \]

where the constants $A_0$ to $A_n$ will be determined. First, note that for any choice of
the constants $L_n(t) \in L$. Since $|t|$ is continuous, so is $L_n(t)$. Moreover, in the
interval $[k/n, (k + 1)/n]$

\[ L_n(t) = \sum_{j=0}^{k} A_j \Bigl( t - \frac{j}{n} \Bigr) + \sum_{j=k+1}^{n-1} A_j \Bigl( \frac{j}{n} - t \Bigr) + A_n, \quad k = 0, 1, \ldots, n - 1, \]

and hence is a linear function of $t$.
Next, if $f(j/n) = y_j$ and $f(t) \in C[0, 1]$, we choose the coefficients $A_j$ in such a
manner that

\[ L_n\Bigl( \frac{j}{n} \Bigr) = y_j, \quad j = 0, 1, 2, \ldots, n. \tag{3.2.19} \]

This gives a non-homogeneous system of $n + 1$ linear equations for the $n + 1$
unknowns $A_0, A_1, \ldots, A_n$. The determinant of this system is seen to be different
from zero so the $A_j$'s are uniquely determined and not all zero. The function $L_n$ is
that unique element of the family $L$ which interpolates $f$ at the division points of the
partition of $[0, 1]$ into $n$ equal parts. It consequently agrees with $f$ at the division
points, and since $f$ is continuous in $[0, 1]$ we expect $L_n(t)$ to give a good
approximation to $f(t)$ in the whole interval.

This is proved as follows. We know that $f$ is uniformly continuous in $[0, 1]$.
Consider its modulus of continuity $\mu(h; f)$ and set $\mu(1/n; f) = \eta_n$. This number
goes to zero as $n \to \infty$. Hence in the interval $k/n \le t \le (k + 1)/n$ we have

\[ |f(t) - y_k| \le \eta_n \quad \text{and} \quad |L_n(t) - y_k| \le |y_{k+1} - y_k| \le \eta_n \tag{3.2.20} \]

so that

\[ |f(t) - L_n(t)| \le 2\eta_n \tag{3.2.21} \]

uniformly in $t$ as asserted. ∎
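The estimate (3.2.21) can be tried out on a concrete function. The sketch below builds the interpolating piece-wise linear function directly from the node values $y_j$, which by uniqueness is the same function as $L_n$ in (3.2.18), and compares the error with $2\eta_n$ for $f(t) = \sqrt{t}$, where $\mu(h; f) = \sqrt{h}$. The names `interpolant` and `sup_error` and the choice of $f$ are assumptions for the example.

```python
import math


def interpolant(f, n):
    """Piece-wise linear function agreeing with f at j/n, j = 0, ..., n."""
    ys = [f(j / n) for j in range(n + 1)]

    def L(t):
        k = min(int(t * n), n - 1)    # subinterval [k/n, (k+1)/n] containing t
        w = t * n - k                 # relative position inside it, in [0, 1]
        return (1.0 - w) * ys[k] + w * ys[k + 1]

    return L


def sup_error(f, L, samples=20001):
    """Grid approximation of the sup-norm of f - L on [0, 1]."""
    return max(abs(f(i / (samples - 1)) - L(i / (samples - 1)))
               for i in range(samples))


errors = {n: sup_error(math.sqrt, interpolant(math.sqrt, n)) for n in (4, 16, 64)}
bounds = {n: 2.0 * math.sqrt(1.0 / n) for n in errors}   # 2 * eta_n for sqrt
```

For this $f$ the observed errors stay well below $2\eta_n$ and decrease as $n$ grows, in line with the proof.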


The corresponding theorem for $P$ is the famous theorem of Karl Weierstrass
(1815-97).


Theorem 3.2.3. The family of polynomials in $t$, restricted to the interval $[a, b]$,
$b - a < \infty$, is dense in $C[a, b]$.

We shall give two proofs in the text and a third one is indicated in Exercise 3.2
below. The first proof is due to Henri Lebesgue (1875-1941). It is based on the
preceding theorem and on


Lemma 3.2.1. The function $x \to |x - c|$, $a < c < b$, can be approximated
uniformly by polynomials in the interval $[a, b]$.


Proof. As above, we take $a = 0$, $b = 1$. We use the fact that the binomial series

\[ \sum_{k=0}^{\infty} (-1)^k \binom{1/2}{k} z^k = (1 - z)^{1/2} \tag{3.2.22} \]

converges absolutely and uniformly for $|z| \le 1$. The binomial coefficients stay
bounded after multiplication by $k^{3/2}$, hence the convergence. To fix the ideas,
suppose that $\tfrac{1}{2} \le c < 1$. Then

\[ |t - c| = \{ c^2 - [c^2 - (t - c)^2] \}^{1/2}, \tag{3.2.23} \]

where the positive square root is taken. This gives

\[ |t - c| = c \Bigl\{ 1 - \Bigl[ 1 - \Bigl( \frac{t - c}{c} \Bigr)^2 \Bigr] \Bigr\}^{1/2} = c \sum_{k=0}^{\infty} (-1)^k \binom{1/2}{k} \Bigl[ 1 - \Bigl( \frac{t - c}{c} \Bigr)^2 \Bigr]^k, \tag{3.2.24} \]

convergent for $|t - c| \le c$ or $0 \le t \le 2c$. The convergence is uniform in this
interval so that the polynomial partial sums converge uniformly to $|t - c|$ in
$[0, 2c]$, a fortiori in $[0, 1]$. If $c < \tfrac{1}{2}$ we replace $c^2$ by $(1 - c)^2$ in (3.2.23) and proceed
in the same manner. ∎
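The partial sums of the binomial series are explicit polynomials in $t$, so the lemma lends itself to a direct numerical check. The sketch below assumes $c = 0.6$ and uses the recurrence $a_{k+1} = a_k (k - \tfrac{1}{2})/(k + 1)$ for the coefficients $a_k = (-1)^k \binom{1/2}{k}$; the function names are hypothetical.

```python
def abs_approx(t, c, terms):
    """Partial sum (first `terms` terms) of the series for |t - c| in powers of z."""
    z = 1.0 - ((t - c) / c) ** 2
    total, a, zk = 0.0, 1.0, 1.0
    for k in range(terms):
        total += a * zk
        a *= (k - 0.5) / (k + 1)      # a_{k+1} = a_k * (k - 1/2) / (k + 1)
        zk *= z
    return c * total


def sup_error(c, terms, samples=1001):
    """Grid approximation of the sup-norm error on [0, 1]."""
    return max(abs(abs(i / (samples - 1) - c) - abs_approx(i / (samples - 1), c, terms))
               for i in range(samples))


errs = [sup_error(0.6, K) for K in (10, 100, 1000)]
```

The errors decrease with the number of terms, though slowly: the worst point is $t = c$, where $z = 1$ and the remainder behaves like a constant times $K^{-1/2}$.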


First proof of Theorem 3.2.3. If $f \in C[0, 1]$, we determine $L_n$ as in the proof of
Theorem 3.2.2 in such a manner that

\[ |f(t) - L_n(t)| < \varepsilon \]

for all $t$ in $[0, 1]$ where $\varepsilon$ is a preassigned small positive number. Here $L_n(t)$ is given
by (3.2.18) for a suitable choice of the constants $A_j$. Lemma 3.2.1 shows that each
of the components in (3.2.18) can be approximated arbitrarily closely in the
interval $[0, 1]$ by polynomials in $t$ with $A_n$ approximated by itself. The sum of
these polynomial approximations multiplied by the appropriate $A_j$ gives a
polynomial approximation of $L_n(t)$ of any desired accuracy in $[0, 1]$ and hence a
polynomial approximation of $f(t)$. Hence $P$ is dense in $C[0, 1]$. ∎


The second proof is due to Sergei Natanovič Bernštein (1912).


Theorem 3.2.4. If $f(t) \in C[0, 1]$ set

\[ B_n(t; f) = \sum_{k=0}^{n} f\Bigl( \frac{k}{n} \Bigr) \binom{n}{k} t^k (1 - t)^{n-k}. \tag{3.2.25} \]

Then

\[ \lim_{n \to \infty} B_n(t; f) = f(t) \]

uniformly in $[0, 1]$.
The proof requires a lemma.


Lemma 3.2.2. For $f(t) = 1, t, t^2$ the $n$th Bernstein polynomial is, respectively,

\[ 1, \quad t, \quad t^2 + \frac{1}{n} t(1 - t). \tag{3.2.26} \]


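Proof. By the binomial theorem

\[ (p + q)^n = \sum_{k=0}^{n} \binom{n}{k} p^k q^{n-k}. \tag{3.2.27} \]

For $p = t$, $q = 1 - t$ we obtain $B_n(t; 1) = 1$. Differentiation of the last identity
with respect to $p$ and multiplication of the result by $p$ gives

\[ np(p + q)^{n-1} = \sum_{k=0}^{n} \binom{n}{k} k p^k q^{n-k}. \]

Division by $n$ and substitution give $B_n(t; t) = t$. A second differentiation gives

\[ n(n - 1)(p + q)^{n-2} = \sum_{k=0}^{n} \binom{n}{k} k(k - 1) p^{k-2} q^{n-k} \]

and

\[ n(n - 1) p^2 (p + q)^{n-2} + np(p + q)^{n-1} = \sum_{k=0}^{n} \binom{n}{k} k^2 p^k q^{n-k}. \]

The substitution $p = t$, $q = 1 - t$, followed by division by $n^2$, gives $B_n(t; t^2)$ as the
last expression under (3.2.26). ∎

Corollary. We have

\[ \sum_{k=0}^{n} \Bigl( t - \frac{k}{n} \Bigr)^2 \binom{n}{k} t^k (1 - t)^{n-k} = \frac{1}{n} t(1 - t). \tag{3.2.28} \]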
Proof of Theorem 3.2.4. The value of $B_n(t; 1) = 1$ gives

\[ B_n(t; f) - f(t) = \sum_{k=0}^{n} \Bigl[ f\Bigl( \frac{k}{n} \Bigr) - f(t) \Bigr] \binom{n}{k} t^k (1 - t)^{n-k}. \tag{3.2.29} \]

It is required to show that the difference is uniformly small in absolute value for
large values of $n$ and this requires each term in the right member to be small. This
is true but not for the same reason for all terms. For $k$ such that $k/n$ is close to the
particular value of $t$ under consideration $|f(k/n) - f(t)|$ is small since $f$ is
continuous, for $k/n$ away from $t$ it is the weight factor $\binom{n}{k} t^k (1 - t)^{n-k}$ which is
small. This follows from the Corollary. For a given $\varepsilon > 0$ and a given $t$, separate
the values of $k$ into two disjoint subsets, $N_1$ and $N_2$. Here

\[ N_1 = \{ k : |t - k/n| \le \varepsilon \}, \quad N_2 = \{ k : |t - k/n| > \varepsilon \}. \]

Let $S_1$ denote that part of the sum in (3.2.29) where $k$ runs over $N_1$ and $S_2$ the rest.
Suppose that $\delta$ is the value of $\mu(h; f)$ for $h = \varepsilon$. Then

\[ |S_1| \le \delta \sum_{N_1} \binom{n}{k} t^k (1 - t)^{n-k} \le \delta, \]

since the summation from 0 to $n$ equals 1.
Next we have

\[ |S_2| \le 2 \|f\| \sum_{N_2} \binom{n}{k} t^k (1 - t)^{n-k}. \]

We now resort to (3.2.28) and obtain

\[ \sum_{N_2} \Bigl( t - \frac{k}{n} \Bigr)^2 \binom{n}{k} t^k (1 - t)^{n-k} \le \frac{1}{n} t(1 - t). \]

In the left member $|t - k/n| > \varepsilon$ so that

\[ \sum_{N_2} \binom{n}{k} t^k (1 - t)^{n-k} \le (4 \varepsilon^2 n)^{-1} \]

since $t(1 - t) \le \tfrac{1}{4}$ in $[0, 1]$. Hence

\[ |S_2| \le \|f\| (2 \varepsilon^2 n)^{-1} \]

and

\[ |B_n(t; f) - f(t)| \le \delta + \|f\| (2 \varepsilon^2 n)^{-1}. \]

Here we can take $\varepsilon = n^{-1/4}$ (any exponent $< \tfrac{1}{2}$ will do) and obtain, finally,

\[ |B_n(t; f) - f(t)| \le \mu(n^{-1/4}; f) + \tfrac{1}{2} \|f\| n^{-1/2}. \tag{3.2.30} \]

This shows that the Bernstein polynomials are dense in $C[0, 1]$. ∎


Corollary. Theorem 3.2.3. 
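The Bernstein polynomials are easy to evaluate directly from (3.2.25), which allows a check of Lemma 3.2.2 and of the estimate (3.2.30). The sketch below is illustrative only: the function names, the evaluation point, and the test function $|s - \tfrac{1}{2}|$ (for which $\mu(h; f) = h$ for $h \le \tfrac{1}{2}$ and $\|f\| = \tfrac{1}{2}$) are assumptions.

```python
from math import comb


def bernstein(f, n, t):
    """Evaluate the n-th Bernstein polynomial of f at t, as in (3.2.25)."""
    return sum(f(k / n) * comb(n, k) * t ** k * (1 - t) ** (n - k)
               for k in range(n + 1))


def sup_error(f, n, samples=101):
    """Grid approximation of the sup-norm of B_n(.; f) - f on [0, 1]."""
    return max(abs(bernstein(f, n, i / (samples - 1)) - f(i / (samples - 1)))
               for i in range(samples))


# Lemma 3.2.2 at a sample point
n, t = 20, 0.3
b_one = bernstein(lambda s: 1.0, n, t)
b_id = bernstein(lambda s: s, n, t)
b_sq = bernstein(lambda s: s * s, n, t)

# Convergence for a function with a corner, compared with the bound (3.2.30)
def kink(s):
    return abs(s - 0.5)

err_10, err_100 = sup_error(kink, 10), sup_error(kink, 100)
bound_100 = 100 ** -0.25 + 0.5 * 0.5 * 100 ** -0.5
```

The three values `b_one`, `b_id`, `b_sq` reproduce $1$, $t$, and $t^2 + t(1 - t)/n$, and the observed error for $n = 100$ lies well inside the bound, which is far from sharp for this function.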


We now turn to the question of the existence of linear bounded functionals on
the space $C[a, b]$. That is, we want to find mappings of $C[a, b]$ into the space $C$ of
complex numbers such that the mappings are linear and bounded.
A trivial but nevertheless important choice of such a mapping is the following.
Take a fixed number $t_0$ with $a \le t_0 \le b$ and set

\[ F(f) = f(t_0). \tag{3.2.31} \]

This is clearly a linear functional and its norm is seen to be 1 (why?). In fact, the
functional is not merely linear but also multiplicative since

\[ F(fg) = F(f) F(g). \tag{3.2.32} \]

It will be seen later that every bounded linear and multiplicative functional on
$C[a, b]$ is of this form, i.e. there exists a $t_0$ in $[a, b]$ such that (3.2.31) holds.

But even for the non-multiplicative case this approach is suggestive. Suppose
that $\{t_n\}$ is a countable set of points in $[a, b]$ and that $\sum_{n=1}^{\infty} c_n$ is an absolutely
convergent series. Then

\[ \sum_{n=1}^{\infty} c_n f(t_n) \tag{3.2.33} \]

is an absolutely convergent series for any choice of $f$ in $C[a, b]$. The series defines
a linear functional on $C[a, b]$ and this is bounded since

\[ \Bigl| \sum c_n f(t_n) \Bigr| \le \sum |c_n| \, \|f\| = C \|f\|. \tag{3.2.34} \]

The norm of the functional is $\le C$ and equality may very well hold. We can
generalize still further using the Riemann-Stieltjes integral instead of infinite series.
Suppose that $g \in BV[a, b]$, i.e. $g$ is a function of bounded variation on $[a, b]$,
for which see next section. This means that for any partition of $[a, b]$ by points $t_k$,

\[ a = t_0 < t_1 < t_2 < \cdots < t_n = b, \]

the finite sum

\[ \sum_{k=1}^{n} |g(t_k) - g(t_{k-1})| \le M, \tag{3.2.35} \]

where $M$ is independent of the partition.
Consider now an arbitrary $f \in C[a, b]$ and a fixed $g \in BV[a, b]$. We can then
define the Riemann-Stieltjes integral

\[ \int_a^b f(s) \, dg(s) \tag{3.2.36} \]

as the limit of Riemann-Stieltjes sums

\[ \sum_{k=1}^{n} f(t_k) [g(t_k) - g(t_{k-1})]. \]

The limit is clearly linear in $f$, and its absolute value does not exceed

\[ V_a^b[g] \, \|f\|. \tag{3.2.37} \]

Here the first factor is the total variation of $g$ over $[a, b]$, that is the least value of
$M$ for which (3.2.35) can hold. Thus (3.2.36) defines a linear bounded functional
on $C[a, b]$. In fact, we have the theorem of F. Riesz (1909).


Theorem 3.2.5. The adjoint space of $C[a, b]$ is the space $BV[a, b]$. Every linear
bounded functional on $C[a, b]$ is of the form (3.2.36) and its norm is at most
equal to the total variation of $g$ in $[a, b]$. Equality holds if $g$ is suitably normalized.
Every such integral defines a linear bounded functional on $C[a, b]$.


The uncertainty in the value of the norm is due to the fact that $g$ is not uniquely
determined by the functional, and this goes back to the fact that in (3.2.36) we can
modify $g$ in infinitely many ways without changing the value of the integral for any
choice of $f$ in $C[a, b]$. That we can add a constant to $g$ is trivial and does not affect
the total variation. But a function of bounded variation may well have a countable
set of discontinuities, though of the first kind, so that $g(t)$ has right and left hand
limits at all points, commonly denoted by $g(t - 0)$ and $g(t + 0)$. If for some
$t = t_0$ we have $g(t_0 - 0) \ne g(t_0 + 0)$, we have a point of discontinuity at $t = t_0$.
The actual value of $g(t_0)$ does not affect the value of the integral but it does affect
the total variation. If $g(b - 0) = g(b)$ and $g(t)$ is right hand continuous at all
other points, we obtain the normalization of F. Riesz for which the norm of the
functional equals the total variation of $g$. We shall not pursue this topic any
further.
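The Riemann-Stieltjes sums above are simple to compute, and two choices of $g$ illustrate the discussion: a unit step, which reproduces the point functional (3.2.31), and a smooth $g$, for which the sum approaches an ordinary integral. The sketch below uses uniform partitions with the right end-points as tags; the function names and the sample functions are assumptions.

```python
import math


def rs_sum(f, g, a, b, n):
    """Riemann-Stieltjes sum of f against g on a uniform partition, right end-point tags."""
    ts = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(f(ts[k]) * (g(ts[k]) - g(ts[k - 1])) for k in range(1, n + 1))


def variation_on_grid(g, a, b, n):
    """Sum of |g(t_k) - g(t_{k-1})| over the same uniform partition."""
    ts = [a + (b - a) * k / n for k in range(n + 1)]
    return sum(abs(g(ts[k]) - g(ts[k - 1])) for k in range(1, n + 1))


def step(s):
    return 1.0 if s >= 0.5 else 0.0     # unit jump at t0 = 0.5, right continuous


def square(s):
    return s * s                        # smooth, increasing on [0, 1]


point_value = rs_sum(math.cos, step, 0.0, 1.0, 1000)     # picks out cos(0.5)
smooth_value = rs_sum(math.cos, square, 0.0, 1.0, 1000)  # approximates the integral of 2 s cos s
exact_smooth = 2.0 * (math.sin(1.0) + math.cos(1.0) - 1.0)
var_square = variation_on_grid(square, 0.0, 1.0, 1000)   # equals g(1) - g(0) = 1
```

In both cases the absolute value of the sum stays below $\|f\|$ times the variation of $g$, as (3.2.37) requires.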

The next question is how to obtain linear bounded transformations on $C[a, b]$
into itself. The possibilities are legion. The following is an example of an important
class of such transformations. Take a function $K(s, t)$ of two variables, defined and
continuous for $(s, t)$ in the square $[a, b] \times [a, b]$ and form the integral

\[ \int_a^b K(s, t) f(s) \, ds = g(t). \tag{3.2.38} \]

For any choice of $f \in C[a, b]$ this is a continuous function of $t$ for $a \le t \le b$.
Moreover,

\[ \|g\| \le \|f\| \sup_{t} \int_a^b |K(s, t)| \, ds. \tag{3.2.39} \]

The integral (3.2.38) is clearly linear in $f$ and the last inequality shows that the
integral defines a bounded linear transformation. The conditions on $K(s, t)$ may
be relaxed in various ways.
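A discretized version of (3.2.38) shows the bound (3.2.39) at work. The sketch below replaces the integral by a midpoint rule; the kernel, the function $f$, and all names are assumptions chosen only for illustration.

```python
import math


def nodes(a, b, n):
    """Midpoints of n equal subintervals of [a, b]."""
    h = (b - a) / n
    return [a + (i + 0.5) * h for i in range(n)], h


def transform(K, f, a, b, t, n=400):
    """Midpoint-rule approximation of g(t) = integral of K(s, t) f(s) ds."""
    ss, h = nodes(a, b, n)
    return h * sum(K(s, t) * f(s) for s in ss)


def kernel_bound(K, a, b, ts, n=400):
    """Midpoint-rule approximation of sup over t of the integral of |K(s, t)| ds."""
    ss, h = nodes(a, b, n)
    return max(h * sum(abs(K(s, t)) for s in ss) for t in ts)


a, b = 0.0, 2.0


def K(s, t):
    return math.exp(-abs(s - t)) * math.cos(4.0 * s * t)   # continuous on the square


def f(s):
    return math.cos(3.0 * s)        # sup-norm 1 on [0, 2], attained at s = 0


ts = [a + (b - a) * j / 100 for j in range(101)]
g_norm = max(abs(transform(K, f, a, b, t)) for t in ts)    # approximates ||g||
bound = 1.0 * kernel_bound(K, a, b, ts)                    # ||f|| times the kernel bound
```

Since the same quadrature nodes are used on both sides, the discrete inequality `g_norm <= bound` holds exactly, mirroring the continuous estimate.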

Before leaving the continuous functions let us consider various related spaces.
One of these is $\mathfrak{M}_n C[a, b]$, the space of $n$ by $n$ matrices the entries of which are
elements of $C[a, b]$. Thus $\mathcal{F}(t) = [f_{jk}(t)]$ with $f_{jk} \in C[a, b]$. The algebraic
operations of addition, scalar multiplication and element multiplication carry over
from $\mathfrak{M}_n$ and $C$ and become meaningful in $\mathfrak{M}_n C[a, b]$. There are several possible
choices of the norm of $\mathcal{F}$. One is

\[ \|\mathcal{F}\| = \sup_{j, \, t} \sum_{k=1}^{n} |f_{jk}(t)|. \tag{3.2.40} \]

Under this norm $\mathfrak{M}_n C[a, b]$ becomes a B-algebra. It is clear that $\mathcal{F}(t_0)$ for $t_0$ fixed,
$a \le t_0 \le b$, is a linear bounded functional on the space. More generally, we can
replace $f(t)$ in (3.2.36) by $\mathcal{F}(t)$ and obtain a linear bounded functional. Here the
integral is interpreted as the matrix having in the place $(j, k)$ the entry

\[ \int_a^b f_{jk}(s) \, dg(s). \tag{3.2.41} \]

Formula (3.2.38) can also be generalized in an obvious manner.

There is another function algebra of considerable interest, namely the
restriction of $C[a, b]$ to functions having continuous derivatives of order $k$ in
$[a, b]$. We denote this set by $C^k[a, b]$. It is clearly an algebra with the operations
defined as in $C[a, b]$. Let

\[ \|f\|_{0,k} = \sum_{j=0}^{k} \frac{1}{j!} \|f^{(j)}\| \tag{3.2.42} \]

be taken as the norm of $f$ in $C^k[a, b]$, where the norm used in the right member is a
sup-norm. It may be shown that (3.2.42) defines a norm and that $C^k[a, b]$ is
complete under this metric. Hence it is a B-space. It is even a B-algebra since the weight
factors $1/j!$ ensure that

\[ \|fg\|_{0,k} \le \|f\|_{0,k} \, \|g\|_{0,k} \tag{3.2.43} \]

by virtue of Leibniz's formula for the derivatives of a product. By an obvious
extension of formula (3.2.36) we can define linear bounded functionals on $C^k[a, b]$:

\[ L[f] = \sum_{j=0}^{k} \int_a^b f^{(j)}(s) \, dg_j(s), \quad g_j(s) \in BV[a, b]. \tag{3.2.44} \]

Finally, we mention the space $C^\infty[a, b]$ of functions having derivatives of all
order in $[a, b]$. It is clearly a linear vector space, even an algebra. It may be made
into a complete metric space by a Fréchet type of a distance

\[ d(f, g) = \sum_{k=0}^{\infty} \frac{1}{2^k} \, \frac{\|f^{(k)} - g^{(k)}\|}{1 + \|f^{(k)} - g^{(k)}\|}. \tag{3.2.45} \]


EXERCISE 3.2 


1. Suppose that $\lim c_n = 0$ and set $f_n = c_n \sin nt$. Is this a Cauchy sequence in $C[0, 2\pi]$?

2. Verify (3.2.6).

3. Verify the statements made concerning the modulus of continuity defined by (3.2.11).

4. Verify that $\mu(h)$ is bounded iff $\mu(b - a; f_\alpha)$ is bounded.

5. Prove that if $\mu$ is bounded and satisfies (3.2.15), then $\mu(h)$ is (i) continuous,
(ii) subadditive, and (iii) non-decreasing.

6. Fill in omitted details in the proof of Lemma 3.2.1 and its corollary.

7. Verify (3.2.22) and the statements made concerning binomial coefficients and nature
of convergence of the series. [Hint: The formula of Wallis for $\pi$ would help.]


8. [Edmund Landau] An alternative proof of Theorem 3.2.3 may be based on the
formula

\[ \lim_{n \to \infty} \frac{(2n + 1)!!}{2 \, (2n)!!} \int_0^1 f(s) [1 - (s - t)^2]^n \, ds = f(t). \]

Here $f \in C[0, 1]$ and the symbol $(2n)!!$ means the product of the even integers $\le 2n$.
Similarly for $(2n + 1)!!$. The limit exists uniformly with respect to $t$ for
$0 < \varepsilon \le t \le 1 - \varepsilon$. Start with $f = 1$. The proof is analogous to that of Theorem
3.2.4 inasmuch as a neighborhood of $s = t$ gives the main contribution and the
integral over the rest of the interval goes to zero, again by the formula of Wallis.


9. Suppose that $h$ is a fixed bounded measurable function in the sense of Lebesgue
(see Section 4.2). Assume the existence of the integral and show that

\[ \int_a^b h(t) f(t) \, dt \]

is a linear bounded functional on $C[a, b]$. Get an upper bound for the norm of the
functional.

10. How should $g$ be chosen in (3.2.36) in order to give the functional of the preceding
problem?

11. Write (3.2.31) as an integral of type (3.2.36).

12. Do the same for (3.2.34) if $\{t_n\}$ is a strictly increasing sequence and $\lim_{n \to \infty} t_n = b$.

13. Verify the stated properties of the transformation (3.2.38).

14. If $|K(s, t)| \le B$ for all $(s, t)$ in $[a, b] \times [a, b]$, show that $B(b - a)$ is an upper bound
for the norm of the transformation. Find an example where this bound is actually
the norm.

15. Verify that $\mathfrak{M}_n C[a, b]$ becomes a B-algebra under the norm (3.2.40).

16. Suggest alternative norms for this space.

17. Show that $\{\mathcal{F}_m\}$ is a Cauchy sequence under the norm (3.2.40) iff each of the $n^2$
sequences $\{f_{jk,m}\}$, $j, k = 1, 2, \ldots, n$, is a Cauchy sequence in $C[a, b]$. Here
$\mathcal{F}_m(t) = [f_{jk,m}(t)]$.

18. Verify that (3.2.41) defines a linear bounded functional on $\mathfrak{M}_n C[a, b]$.

19. How should (3.2.38) be generalized so as to define a linear bounded transformation
from $\mathfrak{M}_n C[a, b]$ into itself?

20. Show that $C^k[a, b]$ is a B-space under the norm (3.2.42).

21. Verify (3.2.43).

22. Verify that (3.2.44) defines a linear functional on $C^k[a, b]$.

23. Verify that $C^\infty[a, b]$ becomes a complete metric space if distances are defined by
(3.2.45).

24. $C[a, b]$ is a separable vector space. How could this be proved on the basis of Theorem
3.2.2?

25. Prove the same assertion on the basis of Theorem 3.2.3.

3.3 FUNCTIONS OF BOUNDED VARIATION 


The class of functions $BV[a, b]$ was introduced towards the end of the preceding
section. It is the set of complex-valued functions defined on the interval $[a, b]$ such
that the total variation of $f$ over $[a, b]$ is finite. Here the total variation, denoted by
$V_a^b[f]$, is the greatest lower bound of the numbers $M$ for which

\[ \sum_{k=1}^{n} |f(t_k) - f(t_{k-1})| \le M, \tag{3.3.1} \]

where

\[ a = t_0 < t_1 < t_2 < \cdots < t_n = b \]

is an arbitrary partition of $[a, b]$ and (3.3.1) is supposed to hold with a fixed $M$ for
all possible partitions.

This restriction on the variation of $f$ does not force $f$ to be continuous but it
does put a restriction on the nature and the number of discontinuities. A
discontinuity always makes a positive contribution to the total variation. The sum of
these contributions must be a convergent series. This means (i) that the set of
discontinuities is countable and (ii) that each discontinuity is of the first kind, i.e.

\[ \lim_{h \downarrow 0} f(t - h) = f(t - 0), \quad \lim_{h \downarrow 0} f(t + h) = f(t + 0) \tag{3.3.2} \]

exist. Any discontinuity of the second kind would lead to unbounded variation in
any interval containing such a point. We recall that any bounded real-valued
monotone function is of bounded variation. Moreover, if $f \in BV[a, b]$, then one
can find four real-valued non-decreasing bounded functions such that

\[ f = f_1 - f_2 + i f_3 - i f_4. \tag{3.3.3} \]

If $f$ itself is real-valued, we may take $f_3 = f_4 = 0$. After this review of the
properties of the elements of the class $BV[a, b]$ we can proceed to the structural
properties of the space.

The class $BV[a, b]$ is a linear vector space over $C$ if addition and scalar
multiplication are defined in the obvious manner

\[ (f + g)(t) = f(t) + g(t), \tag{3.3.4} \]
\[ (\alpha f)(t) = \alpha f(t). \tag{3.3.5} \]
Actually the product of two elements in $BV[a, b]$ also belongs to the class. For

\[
\begin{aligned}
\sum_{k=1}^{n} |f(t_k) g(t_k) - f(t_{k-1}) g(t_{k-1})|
&\le \sum_{k=1}^{n} |f(t_k)| \, |g(t_k) - g(t_{k-1})| + \sum_{k=1}^{n} |g(t_{k-1})| \, |f(t_k) - f(t_{k-1})| \\
&\le \sup |f| \sum_{k=1}^{n} |g(t_k) - g(t_{k-1})| + \sup |g| \sum_{k=1}^{n} |f(t_k) - f(t_{k-1})| \\
&\le \sup |f(t)| \, V_a^b[g] + \sup |g(t)| \, V_a^b[f].
\end{aligned}
\tag{3.3.6}
\]

Here we have

\[ \sup |f(t)| \le |f(a)| + V_a^b[f], \quad \sup |g(t)| \le |g(a)| + V_a^b[g], \tag{3.3.7} \]

so that, finally

\[ V_a^b[fg] \le |f(a)| \, V_a^b[g] + |g(a)| \, V_a^b[f] + 2 V_a^b[f] \, V_a^b[g]. \tag{3.3.8} \]

On the other hand, the reciprocal of a function of bounded variation $f$ belongs to
$BV[a, b]$ iff $\inf |f(t)|$, $a \le t \le b$, is $> 0$, in which case the total variation of $f^{-1}$ is
at most

\[ \{ \inf |f(t)| \}^{-2} \, V_a^b[f]. \tag{3.3.9} \]

The proof is left to the reader.

It follows that $BV[a, b]$ is a commutative algebra. It has a unit element, the
function $1(t)$ which is identically 1 in $[a, b]$. We say that $f$ is a regular element of the
algebra if $f^{-1} \in BV[a, b]$, otherwise singular.

There are obviously divisors of zero since we can have

\[ f(t) g(t) = 0 \]

without either factor being identically zero.
In this algebra there is a profusion of idempotents. The equation

\[ [f(t)]^2 = f(t) \tag{3.3.10} \]

shows that the range of $f$ must be one of the three sets $\{0\}$, $\{1\}$, and $\{0, 1\}$. The
first two possibilities give continuous idempotents, the functions $0(t)$ and $1(t)$.
Any idempotent with the range $\{0, 1\}$ is necessarily discontinuous. If $t = t_0$,
$a < t_0 < b$, is a point of discontinuity, each of the three numbers $f(t_0 - 0)$, $f(t_0)$,
$f(t_0 + 0)$ can be either 0 or 1 and there are four combinations which correspond to a
discontinuity. Two of these correspond to one-sided continuity and contribute one
unit to the total variation. The other two combinations involve $f(t_0)$ different from
both $f(t_0 - 0)$ and $f(t_0 + 0)$ and contribute two units to the total variation. At the
end-points only the first type of a jump can occur. Since the total variation of $f$ has
to be an integer, only a finite number of discontinuities can occur but the number
may be arbitrarily large. Thus there is a non-denumerable set of idempotents.

On the other hand, there are no nilpotents except for zero.

We can introduce a norm in $BV[a, b]$ based on the characteristic property of
the elements to be of bounded variation. The first choice that comes to mind is to
take $V_a^b[f]$ as the norm of $f$. This does not work since all constants would then have
the norm zero while only the zero element $0(t)$ can be allowed to have this
property. To get a better separation we could use some fixed linear combination
of $|f(a)|$ and $V_a^b[f]$, say

\[ \|f\| = |f(a)| + 2 V_a^b[f]. \tag{3.3.11} \]

Here the factor "2" is suggested by (3.3.8) and our desire to make sure that

\[ \|fg\| \le \|f\| \, \|g\| \tag{3.3.12} \]

as required for a B-algebra. The choice (3.3.11) ensures this.
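The role of the factor 2 can be seen on sampled functions. In the sketch below the variation is computed over one fixed partition only, so it is a lower estimate of $V_a^b$; the inequalities (3.3.8) and (3.3.12) nevertheless hold exactly for these partition sums, by the same argument as in the text. The function names and the two sample functions are assumptions.

```python
import math


def variation(values):
    """Sum of |v_k - v_{k-1}| over consecutive samples (variation on a fixed partition)."""
    return sum(abs(values[k] - values[k - 1]) for k in range(1, len(values)))


def bv_norm(values):
    """Partition analogue of (3.3.11): |f(a)| + 2 * variation."""
    return abs(values[0]) + 2.0 * variation(values)


ts = [k / 200 for k in range(201)]                                  # partition of [0, 1]
f = [math.sin(7.0 * t) + (1.0 if t >= 0.4 else 0.0) for t in ts]    # has a jump at t = 0.4
g = [math.cos(3.0 * t) - 0.5 * t for t in ts]
fg = [u * v for u, v in zip(f, g)]

lhs_338 = variation(fg)
rhs_338 = (abs(f[0]) * variation(g) + abs(g[0]) * variation(f)
           + 2.0 * variation(f) * variation(g))
norm_product = bv_norm(fg)
product_of_norms = bv_norm(f) * bv_norm(g)
```

Expanding `product_of_norms` gives $|f(a)||g(a)| + 2|f(a)|V[g] + 2|g(a)|V[f] + 4V[f]V[g]$, which dominates $|f(a)g(a)| + 2V[fg]$ precisely because of (3.3.8).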

Since an element of $BV[a, b]$ is regular iff it is bounded away from zero, we
see that the spectrum of $f$ coincides with the closure of the range of $f$. Note that the
range itself need not be closed. The spectrum is a bounded closed set in the
complex plane which need not be connected and may contain isolated points. The
situation here is totally different from that holding in $C[a, b]$. We still have

\[ R(\lambda, f)(t) = \frac{1}{\lambda - f(t)}, \tag{3.3.13} \]

however.

The problem of constructing linear bounded functionals is harder for BV[a, b] 
than for C[a, b]. The theory of the Riemann-Stieltjes integral is still suggestive, 
however. The point of departure is the formula for integration by parts: 

$$\int_a^b f(t)\,dg(t) = f(b)g(b) - f(a)g(a) - \int_a^b g(t)\,df(t). \tag{3.3.14}$$

Here f is an arbitrary element of BV[a, b] and g is a fixed element of C[a, b]. The 
formula serves to define the integral of a function of bounded variation with 
respect to a continuous function. To us it gives a means of defining linear bounded 
functionals on BV[a, b]. The left member is clearly linear in f and its value is a 
number. Thus to each $g \in C[a, b]$ corresponds a linear functional on BV[a, b] 
defined by (3.3.14). This functional is bounded since 


$$\left|\int_a^b f(t)\,dg(t)\right| \le |f(b)|\,|g(b)| + |f(a)|\,|g(a)| + \|g\|_C\,V_a^b[f]$$

$$\le 2\,\|g\|_C\,\{|f(a)| + V_a^b[f]\} \le 2\,\|g\|_C\,\|f\| \tag{3.3.15}$$

by (3.3.11). Here the subscript C indicates the sup-norm used in C[a, b] while the 
unmarked norm is that defined for BV[a, b]. 

In this manner we can define a class of linear bounded functionals on 
BV[a, b]. There are linear bounded functionals, however, which do not fit into this 
pattern, so we have given only a partial solution of the problem. The adjoint space 
of BV[a, b] contains C[a, b] as a proper subspace. 

Finally, we shall define a class of linear bounded transformations from 
BV[a, b] into itself analogous to those defined in C[a, b] by formula (3.2.38). 
Set 

$$g(t) = \int_a^b K(s, t)\,df(s), \tag{3.3.16}$$

where $f \in BV[a, b]$ and the kernel K(s, t) is to be chosen so that $g \in BV[a, b]$ 
whenever f is. Since 

$$\sum_{j=1}^{n} |g(t_j) - g(t_{j-1})| \le \int_a^b \Big\{\sum_{j=1}^{n} |K(s, t_j) - K(s, t_{j-1})|\Big\}\,dV_a^s[f], \tag{3.3.17}$$

it is sufficient for our purpose if (i) K(s, t) is continuous in s for each fixed t, and 
(ii) K(s, t) is of bounded variation in t for fixed s, the total variation being a 
bounded function of s. Both conditions are satisfied if, for instance, K(s, t) and 
$K_t(s, t)$ are continuous functions of (s, t) in [a, b] x [a, b]. In formula (3.3.17) 
the symbol $V_a^s[f]$ stands for the total variation of f(t) in the interval [a, s]. 


EXERCISE 3.3 


1. Show that $V_a^t[f]$ is a non-negative, non-decreasing, and bounded function of t if 
$f \in BV[a, b]$, and hence also belongs to this space. 


2. Let f be a real-valued element of BV[a, b]. Then 
$$f(t) = V_a^t[f] - \{V_a^t[f] - f(t)\}.$$
Show that each of the components is real-valued and non-decreasing. Use this to 
prove (3.3.3) for complex-valued elements of BV[a, b]. 


3. If f is a real-valued, non-decreasing element of BV[a, b], show that the number of 
discontinuities is denumerable, that one-sided limits exist everywhere, and that 
$f(t - 0) \le f(t) \le f(t + 0)$. Use this to prove the assertions about discontinuities 
made in the text. Is BV[a, b] a separable space? 


4. Prove the assertions made concerning $f^{-1}$ and, in particular, the estimate (3.3.9) for 
the total variation of $f^{-1}$. 


5. If f is a non-trivial idempotent, show that it is a divisor of zero by exhibiting a g in 
BV[a, b] such that g is also a non-trivial idempotent and their product is the zero 
element. 


6. Verify that (3.3.11) defines a norm and that (3.3.12) holds. 
7. Verify (3.3.14). 


8. Fill in omitted details in the proof of (3.3.15). 
9. The inequality (3.3.17) is based on the estimate 

$$\left|\int_a^b G(s)\,df(s)\right| \le \int_a^b |G(s)|\,dV_a^s[f],$$

where G(s) is continuous in [a, b]. Verify this inequality by considering the 
corresponding Riemann-Stieltjes sums. 


10. Fill in other omitted details in the discussion of (3.3.16). 


11. What conditions should K(s, t) satisfy in order that (3.3.16) define a linear bounded 
transformation from BV[a, b] to C[a, b]? Get an upper bound for the norm of such a 
transformation. 


12. Discuss the space $\mathfrak{M}_n BV[a, b]$ along the lines of the corresponding discussion for 
$\mathfrak{M}_n C[a, b]$ in Section 3.2. The elements of $\mathfrak{M}_n BV[a, b]$ are n by n matrices $\{f_{jk}(t)\}$ 
with entries in BV[a, b]. 


COLLATERAL READING 


For a review of the elementary properties of sequences, continuous functions, and functions 
of bounded variation the reader may find one of the following treatises useful: 


BARTLE, R. G., The Elements of Real Analysis, Wiley, New York (1964). 
HILLE, E., Analysis, 2 vols., Blaisdell, New York (1965). 


See also the references under Chapter 2 for the functional analytical aspects of the spaces 
in question. 


4 LEBESGUE SPACES* 


This chapter is a continuation of the study of special linear vector spaces. Here we 
shall be concerned with integrable functions. Integrable in what sense? As an 
orientation consider the space of continuous functions on [a, b], where, however, 
we introduce a different metric than that defined by the sup-norm. Set 


$$\|f\|_1 = \int_a^b |f(t)|\,dt,$$


where the integral is taken in the Riemann sense. This clearly defines a normed 
metric, but the space is no longer complete. If we complete the space by adjoining 
the limit functions of all Cauchy sequences, we obtain a much larger space, namely 
L(a, b), the space of Lebesgue integrable functions. More generally, we could 
consider the corresponding problem in $R^m$ and for a different original metrization. 
There is a continuum of Lebesgue spaces $L_p$, $1 \le p \le \infty$, even on the line. These 
spaces will be the center of attention of this chapter. To make the discussion 
self-sufficient we start with a fairly detailed discussion of Lebesgue measure and 
integration. We also include the elements of the theory of Fourier series which, in 
the $L_2$ case, will give another concrete realization of a Hilbert space of infinite 
dimensions. 

There are five sections: The m-dim Lebesgue measure; Lebesgue measurable 
functions; Lebesgue integration; Lebesgue spaces; and Remarks on Fourier series. 


4.1 THE m-DIM LEBESGUE MEASURE 
We start with the notion of a σ-algebra (or a σ-field). 


Definition 4.1.1. A non-empty collection A of subsets of a set X is called a 
σ-algebra iff 


1) $\emptyset$, X belong to A. 

2) $S \in A$ implies $X \ominus S = S^c \in A$. 

3) If $\{S_n\}$ is a sequence of sets in A, then the union $\bigcup_{n=1}^{\infty} S_n$ belongs to A. 


* I am indebted to Dr. Ih-Ching Hsü for a thorough revision of this chapter. 
An ordered pair (X, A) consisting of a set X and a σ-algebra A of subsets of X 
is called a measurable space. Any set in A is called a measurable set (more exactly, 
an A-measurable set). 


Definition 4.1.2. A measure is an extended real-valued function μ defined on a 
σ-algebra A such that (1) $\mu(\emptyset) = 0$, (2) $\mu(S) \ge 0$ for all $S \in A$, and (3) μ is 
countably additive in the sense that if $\{S_n\}$ is any sequence of disjoint sets in A, 
then 

$$\mu\Big(\bigcup_{n=1}^{\infty} S_n\Big) = \sum_{n=1}^{\infty} \mu(S_n).$$

Since we permit μ to take on $+\infty$, the series $\sum_{n=1}^{\infty} \mu(S_n)$ may be a divergent 
one. If a measure does not take on $+\infty$, we say that it is a finite measure. We shall 
now list, without proof, a few simple results that will be needed later. 


Lemma 4.1.1. Let μ be a measure defined on a σ-algebra A. If $S_1$ and $S_2$ 
belong to A and $S_1 \subset S_2$, then $\mu(S_1) \le \mu(S_2)$. If $\mu(S_1) < +\infty$, then 
$\mu(S_2 \ominus S_1) = \mu(S_2) - \mu(S_1)$. 


Lemma 4.1.2. Let μ be a measure defined on a σ-algebra A. 


1) If $\{S_n\}$ is an increasing sequence in A, then 

$$\mu\Big(\bigcup_{n=1}^{\infty} S_n\Big) = \lim_{n \to \infty} \mu(S_n).$$

2) If $\{T_n\}$ is a decreasing sequence in A and if $\mu(T_1) < +\infty$, then 

$$\mu\Big(\bigcap_{n=1}^{\infty} T_n\Big) = \lim_{n \to \infty} \mu(T_n).$$


Definition 4.1.3. A measure space is a triple (X, A, μ) consisting of a set X, 
a σ-algebra A of subsets of X, and a measure μ defined on A. 


As acceptable sets of exception, sets of measure zero are of great importance 
in the theory of measure. Before we see the role of sets of measure zero, we 
introduce the terminology almost everywhere. A property is said to hold almost 
everywhere (abbreviated a.e.) if the set of points where it fails to hold is a set of 
measure zero. Thus in particular we say that two functions f, g are equal μ-almost 
everywhere or that they are equal for μ-almost all x if f and g have the same domain 
and $\mu\{x;\ f(x) \ne g(x)\} = 0$. In this case we will often write f = g, μ-a.e. Similarly, 
we often write $f = \lim_{n \to \infty} f_n$, μ-a.e., if there is a set S of measure zero such that $f_n(x)$ 
converges to f(x) for each x not in S. 

After Definitions 4.1.1, 4.1.2, and 4.1.3 are given, examples can follow easily. 

As an illustration, let X be a given non-empty set; then evidently the power set of 
X, $2^X$ (i.e. the set of all subsets of X), is a σ-algebra. So $(X, 2^X)$ is a measurable 
space. Moreover, let P be a fixed element of X. A function μ on $2^X$ can be defined 
as follows: 

$$\mu(S) = \begin{cases} 0 & \text{if } P \notin S, \\ 1 & \text{if } P \in S. \end{cases}$$

It is readily seen that μ is a finite measure; it is called the unit measure concentrated 
at P. By definition $(X, 2^X, \mu)$ is a measure space. 

A book on measure theory may easily go further on methods of constructing 
σ-algebras and measures. To make our material self-sufficient, we introduce one 
method of defining Lebesgue measure on the real line R, then apply the same 
technique to get the Lebesgue m-dim measure. 

The procedure that we employ is the following: we shall obtain a set function $\mu^*$ 
defined for all subsets of R, and then pick out a collection of sets which forms a 
σ-algebra and on which $\mu^*$ becomes a measure. The length l(I) of an interval I is 
defined, as usual, to be the difference of the coordinates of the endpoints of the 
interval. For each set S of the real line consider the countable collections $\{I_n\}$ of 
open intervals which cover S, that is, collections for which $S \subset \bigcup_{n=1}^{\infty} I_n$, and for each 
such collection consider the sum, $\sum_{n=1}^{\infty} l(I_n)$, of the lengths of the intervals in the 
collection. This sum is well defined, independently of the order of the terms, since 
the lengths are positive numbers. If S is an arbitrary subset of R, we define 
$\mu^*(S) = \inf \sum_{n=1}^{\infty} l(I_n)$, where the infimum is extended over all countable collections 
$\{I_n\}$ of open intervals such that $S \subset \bigcup_{n=1}^{\infty} I_n$. Though $\mu^*$ is not generally a measure, 
$\mu^*$ does have a few properties reminiscent of a measure. 
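
The definition of the outer measure by coverings can be made concrete for a countable set. The sketch below is an illustration added here, not taken from the text: it covers the n-th point of a finite list by an open interval of length ε/2^n, so the total length stays below ε however many points are listed. The names `cover_points` and `total_length`, and the particular enumeration of rationals, are assumptions made for the example.

```python
from fractions import Fraction


def cover_points(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of length eps / 2**n.

    Returns the list of intervals (lo, hi), each centred at its point.
    """
    cover = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 1)
        cover.append((x - half, x + half))
    return cover


def total_length(cover):
    """Sum of the lengths l(I_n) of the covering intervals."""
    return sum(hi - lo for lo, hi in cover)


# A finite initial segment of an enumeration of the rationals in [0, 1]
# (repetitions are harmless for a covering argument).
points = [Fraction(p, q) for q in range(1, 6) for p in range(q + 1)]
eps = Fraction(1, 100)
cover = cover_points(points, eps)

print(len(points), total_length(cover), total_length(cover) < eps)
```

Since the bound ε does not depend on the number of points, the same construction applied to a full enumeration suggests that every countable subset of R has outer measure zero.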

Without proofs, we list the following: 


Lemma 4.1.3. The above-defined function $\mu^*$ on the power set of R satisfies 

1) $\mu^*(\emptyset) = 0$. 

2) $\mu^*(S) \ge 0$, for $S \subset R$. 

3) If $S \subset T \subset R$, then $\mu^*(S) \le \mu^*(T)$. 

4) If $\{S_n\}$ is a sequence of subsets of R, then $\mu^*\big(\bigcup_{n=1}^{\infty} S_n\big) \le \sum_{n=1}^{\infty} \mu^*(S_n)$. 


Lemma 4.1.4. For every interval I, half-open or open or closed, finite or infinite, 
$\mu^*(I) = l(I)$. 


Lemma 4.1.3 motivates the following definition of outer measures in an 
arbitrarily given set X. 


Definition 4.1.4. An outer measure in a set X is a non-negative function $\mu^*$ defined 
on the set $2^X$ of all subsets of X such that 


1) $\mu^*(\emptyset) = 0$. 
2) $\mu^*$ is monotonic, that is, $\mu^*(S) \le \mu^*(T)$ whenever $S \subset T \subset X$. 
3) $\mu^*$ is countably subadditive, that is, 

$$\mu^*\Big(\bigcup_{n=1}^{\infty} S_n\Big) \le \sum_{n=1}^{\infty} \mu^*(S_n)$$

for every sequence $\{S_n\}$ of subsets of X. 
subsets of a set X. It is just as seldom that one picks a σ-algebra a priori and then 
defines directly a measure on it. Almost always one starts with some family B of 
subsets of X (for example, intervals) and a non-negative function l on B (for 
example, length of intervals) and then one tries to generate a measure from these. 
We now discuss a fundamental process for doing so which is due (1918) to 
C. Carathéodory (1873-1950). It consists of two parts. First, it produces from l 
and B an outer measure $\mu^*$ defined on all the subsets of X. This $\mu^*$, however, is not 
countably additive in general, but only subadditive. The second part of the 
process then selects a σ-algebra M on which $\mu^*$ is countably additive. 


Definition 4.1.5. Let $\mu^*$ be an outer measure in a given set X. A subset S of X is 
said to be $\mu^*$-measurable iff for each subset T of X we have 

$$\mu^*(T) = \mu^*(T \cap S) + \mu^*(T \ominus S).$$

While a proof is asked for in Exercise 4.1, we state the following. 


Theorem 4.1.1. Let $\mu^*$ be an outer measure in a given set X, and let M denote 
the collection of all $\mu^*$-measurable sets. The following statements are true. 


1) If $\mu^*(S) = 0$, then $S \in M$. 

2) M is a σ-algebra in X. 

3) $\mu^*$ is countably additive on M. (Hence if μ denotes the restriction of $\mu^*$ to M, 
then μ is a measure on M.) 
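
On a finite set the Carathéodory criterion of Definition 4.1.5 can be tested by brute force. The following sketch is an added illustration under stated assumptions: the set X = {0, 1, 2} and the two outer measures (the counting measure, and the function equal to 1 on every non-empty set) are chosen only for the example, and the helper names `power_set`, `measurable_sets`, and `one_if_nonempty` are hypothetical.

```python
from itertools import combinations


def power_set(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def measurable_sets(universe, outer):
    """Sets S with outer(T) == outer(T & S) + outer(T - S) for every subset T."""
    subsets = power_set(universe)
    return [
        S for S in subsets
        if all(outer(T) == outer(T & S) + outer(T - S) for T in subsets)
    ]


def one_if_nonempty(S):
    """An outer measure that is 0 on the empty set and 1 on every other set."""
    return 1 if S else 0


X = frozenset({0, 1, 2})

# The counting measure is additive, so every subset passes the test.
print(len(measurable_sets(X, len)))

# For the second outer measure only the empty set and X itself pass.
print(measurable_sets(X, one_if_nonempty))
```

In both cases the measurable sets form a σ-algebra, as Theorem 4.1.1 asserts; the second example shows that this σ-algebra can be as small as {∅, X}.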


Starting with the family B of all open intervals of the real line R, and using the 
length of intervals as a non-negative function l on B, we have already constructed 
an outer measure $\mu^*$ in R. As a special case of Theorem 4.1.1, we present the 
following. 


Theorem 4.1.2. The family M of all $\mu^*$-measurable subsets of R is a σ-algebra 
and the restriction of $\mu^*$ to M, denoted by μ, is a measure on M. This μ is called 
the Lebesgue measure on R. A subset S of R is called Lebesgue measurable iff S 
is a member of M. 


For the construction of Lebesgue measure in Euclidean m-space, $R^m$, the basic 
ideas are the same as in the case of R. Only the details are more complicated. 
A point $P \in R^m$ is an m-tuple $(x_1, x_2, \ldots, x_m)$ of real numbers. An open block (or 
open interval, open parallelepiped) in $R^m$ is a set of the form 
$B = \{P \mid a_i < x_i < b_i;\ i = 1, 2, \ldots, m\}$. By convention, $b_i$ $(1 \le i \le m)$ or $a_i$ $(1 \le i \le m)$ may take on $+\infty$ 
or $-\infty$ to form infinite open blocks. By introducing an obvious modification, a 
closed block, finite or infinite, in $R^m$ can be defined similarly. We shall also be 
interested in half-open blocks: $\{P \mid a_i \le x_i < b_i,\ i = 1, 2, \ldots, m\}$. For blocks in $R^m$, 
half-open, or open or closed, finite or infinite, the quantity 

$$v(B) = (b_1 - a_1) \cdots (b_m - a_m),$$

which may be $+\infty$, will be called the volume of the block B. The reader will 
recognize that v(B) agrees with the usual definitions of length, area, or volume, 
according as B is an interval of R, a rectangle of $R^2$, or a rectangular parallelepiped 
of $R^3$. If S is an arbitrary subset of $R^m$, we define $\mu^*(S) = \inf \sum_{n=1}^{\infty} v(B_n)$, where the 
infimum is extended over all countable collections $\{B_n\}$ of open blocks in $R^m$ such 
that $S \subset \bigcup_{n=1}^{\infty} B_n$. Without giving proofs, we state the following, which are general 
cases of Lemmas 4.1.3 and 4.1.4. 


Lemma 4.1.5. $\mu^*$ is an outer measure in $R^m$. 


Lemma 4.1.6. For every block B, half-open, or open or closed, finite or infinite, 
$\mu^*(B) = v(B)$. 


Theorem 4.1.1 suggests the following special case. 


Theorem 4.1.3. The family $M_m$ of all $\mu^*$-measurable subsets of $R^m$ is a 
σ-algebra, and the restriction of $\mu^*$ to $M_m$, $\mu = \mu^*|M_m$, is a measure on $M_m$. 
This μ is called the m-dim Lebesgue measure on $R^m$. A subset S of $R^m$ is called 
Lebesgue measurable iff S is a member of $M_m$. 


Lemma 4.1.7. Let c be any finite real number, and r any one of the integers 
1, 2, ..., m; then the sets $\{P \mid (x_1, x_2, \ldots, x_r, \ldots, x_m) = P \in R^m,\ x_r \ge c\}$ and 
$\{P \mid (x_1, x_2, \ldots, x_r, \ldots, x_m) = P \in R^m,\ x_r < c\}$, so-called special half-spaces of $R^m$, 
are Lebesgue measurable. 


Proof. Let H be a special half-space of $R^m$. In order to establish the equality 
$\mu^*(T) = \mu^*(T \cap H) + \mu^*(T \ominus H) = \mu^*(T \cap H) + \mu^*(T \cap H^c)$, for any subset T 
of $R^m$, it suffices to prove $\mu^*(T \cap H) + \mu^*(T \cap H^c) \le \mu^*(T)$, since $\mu^*$ is 
subadditive. If $\mu^*(T) = \infty$, then there is nothing to prove. We assume $\mu^*(T) < \infty$. 
In case $\{B_n\}$ is a sequence of open blocks such that $T \subset \bigcup_{n=1}^{\infty} B_n$, then 
$T \cap H \subset \bigcup_{n=1}^{\infty} (B_n \cap H)$ and $T \cap H^c \subset \bigcup_{n=1}^{\infty} (B_n \cap H^c)$. Hence, by monotonicity and 
countable subadditivity of $\mu^*$, 

$$\mu^*(T \cap H) + \mu^*(T \cap H^c) \le \sum_{n=1}^{\infty} \mu^*(B_n \cap H) + \sum_{n=1}^{\infty} \mu^*(B_n \cap H^c) = \sum_{n=1}^{\infty} [\mu^*(B_n \cap H) + \mu^*(B_n \cap H^c)].$$

If we show that for every open block B in $R^m$ 

$$\mu^*(B) = \mu^*(B \cap H) + \mu^*(B \cap H^c), \tag{4.1.1}$$

then $\mu^*(T \cap H) + \mu^*(T \cap H^c) \le \sum_{n=1}^{\infty} \mu^*(B_n)$ for every such covering, whence 
$\mu^*(T \cap H) + \mu^*(T \cap H^c) \le \inf \sum_{n=1}^{\infty} \mu^*(B_n) = \mu^*(T)$, and the lemma is proved. 

Suppose B is given by $\{P \mid P \in R^m,\ a_i < x_i < b_i,\ i = 1, 2, \ldots, m\}$. Without loss 
of generality, we may assume that H is given by $\{P \mid P \in R^m,\ x_1 < c\}$. Since (4.1.1) 
is obviously true if $c \le a_1$ or $c \ge b_1$, let us suppose $a_1 < c < b_1$. Consequently, 
$B \cap H$ is an open block given by 

$$\{P \mid P \in R^m,\ a_1 < x_1 < c \text{ and } a_i < x_i < b_i,\ i = 2, 3, \ldots, m\};$$

$B \cap H^c$ is the set given by 

$$\{P \mid P \in R^m,\ c \le x_1 < b_1 \text{ and } a_i < x_i < b_i,\ i = 2, 3, \ldots, m\}.$$

It can be verified easily that 

$$\mu^*(B \cap H) = (c - a_1) \prod_{i=2}^{m} (b_i - a_i),$$
$$\mu^*(B \cap H^c) = (b_1 - c) \prod_{i=2}^{m} (b_i - a_i), \text{ and}$$
$$\mu^*(B \cap H) + \mu^*(B \cap H^c) = (b_1 - a_1) \prod_{i=2}^{m} (b_i - a_i) = \mu^*(B). \ ∎$$
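
The arithmetic behind (4.1.1) is simply the additivity of the volume formula when a block is cut by a hyperplane $x_1 = c$. A small sketch, added here for illustration: a block is represented by its list of coordinate pairs $(a_i, b_i)$, and the helper names `volume` and `split_by_halfspace` are assumptions of the example, not notation from the text.

```python
from fractions import Fraction
from math import prod


def volume(block):
    """v(B) for a block given as a list of (a_i, b_i) pairs."""
    return prod(b - a for a, b in block)


def split_by_halfspace(block, c):
    """Cut B by H = {x_1 < c}, assuming a_1 < c < b_1.

    Returns the coordinate pairs of B intersect H and of B intersect H^c.
    """
    (a1, b1), rest = block[0], block[1:]
    return [(a1, c)] + rest, [(c, b1)] + rest


B = [(Fraction(0), Fraction(3)), (Fraction(1), Fraction(2)), (Fraction(-1), Fraction(4))]
left, right = split_by_halfspace(B, Fraction(1, 2))

print(volume(left), volume(right), volume(B))
```

The sketch checks only the volumes; that these volumes equal the outer measures of the two pieces is the content of Lemma 4.1.6.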


Lemma 4.1.8. Every half-open block B in $R^m$ is Lebesgue measurable and 
$\mu^*(B) = \mu(B) = v(B)$. 


Proof. By the previous lemma, for each i, $1 \le i \le m$, $H_i$ is Lebesgue measurable, 
where $H_i = \{P \mid P \in R^m,\ c_i \le x_i\}$. The complement $K_i$ of $H_i$, $K_i = \{P \mid P \in R^m,\ c_i > x_i\}$, 
is therefore also Lebesgue measurable. Without loss of generality, 
suppose a half-open block B is given by $B = \{P \mid b_i \le x_i < c_i,\ i = 1, 2, \ldots, m\}$. 
Then $B = \bigcap_{i=1}^{m} (J_i \cap K_i)$, where $J_i = \{P \mid P \in R^m,\ b_i \le x_i\}$, $i = 1, 2, \ldots, m$. As a 
finite intersection of Lebesgue measurable sets, B is therefore Lebesgue measurable. 
It follows from the definition of Lebesgue measurability and Lemma 4.1.6 that 

$$\mu^*(B) = \mu(B) = v(B). \ ∎$$


Lemma 4.1.9. Every open set in $R^m$ is the union of a countable collection of 
disjoint half-open blocks. 


Proof. Let G be an open set in $R^m$. For each positive integer k, the hyperplanes 

$$x_i = \frac{j}{2^k}, \quad i = 1, 2, \ldots, m; \quad j = 0, \pm 1, \pm 2, \ldots \tag{4.1.2}$$

partition $R^m$ into a countable collection of disjoint half-open blocks. Let 
$B_1^1, B_2^1, B_3^1, \ldots$ be the collection of such blocks generated by (4.1.2) for k = 1 that 
are contained in G. We use recursion to define suitable $B_j^k$'s. For k > 1, let 
$B_1^k, B_2^k, B_3^k, \ldots$ be the collection of half-open blocks generated by (4.1.2) which are 
contained in G but not contained in any block $B_j^q$ with $1 \le q < k$. If $P \in G$, then 
P is an interior point; so there is a partition of $R^m$ given by (4.1.2) such that the 
block containing P is contained in G. Therefore 

$$\bigcup_{k=1}^{\infty} \bigcup_{j} B_j^k \supset G.$$

Since $B_j^k \subset G$ for each j, k, we have 

$$\bigcup_{k=1}^{\infty} \bigcup_{j} B_j^k = G.$$

This is clearly a countable collection of half-open blocks, and we have constructed 
them so they are disjoint. ∎ 
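
The recursive selection in the proof of Lemma 4.1.9 can be carried out explicitly in one dimension. The sketch below is an added illustration, restricted (as an assumption) to m = 1 and to an open interval (a, b) with rational end-points; the function name `dyadic_blocks` is hypothetical. At level k it keeps the half-open dyadic intervals $[j/2^k, (j+1)/2^k)$ that lie in (a, b) and are not contained in an interval chosen at an earlier level.

```python
import math
from fractions import Fraction


def dyadic_blocks(a, b, max_level):
    """Disjoint half-open dyadic intervals inside the open interval (a, b).

    Levels k = 1, ..., max_level are processed in order; an interval is kept
    if it lies in (a, b) and is not inside a previously chosen interval.
    """
    chosen = []
    for k in range(1, max_level + 1):
        step = Fraction(1, 2 ** k)
        j = math.floor(a / step)
        while j * step < b:
            lo, hi = j * step, (j + 1) * step
            inside_g = a < lo and hi <= b          # [lo, hi) is a subset of (a, b)
            already = any(c_lo <= lo and hi <= c_hi for c_lo, c_hi in chosen)
            if inside_g and not already:
                chosen.append((lo, hi))
            j += 1
    return chosen


blocks = dyadic_blocks(Fraction(0), Fraction(1), max_level=4)
for lo, hi in blocks:
    print(f"[{lo}, {hi})")
print(sum(hi - lo for lo, hi in blocks))
```

For G = (0, 1) the chosen intervals are [1/2, 1), [1/4, 1/2), [1/8, 1/4), ..., and their total length tends to 1 as the number of levels grows, in agreement with the countable additivity used in Theorem 4.1.4.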


Theorem 4.1.4. An open set in $R^m$ is Lebesgue measurable and so is a closed set, 
which is the complement of an open set. 


Proof. The proof follows from Lemmas 4.1.9 and 4.1.8. ∎ 


Any Lebesgue measurable set can be approximated in measure by open sets 
from above and by closed sets from below in the following sense. 


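Theorem 4.1.5. Let S be any Lebesgue measurable set in $R^m$. Given $\varepsilon > 0$, 
there exist an open set $G_\varepsilon$ in $R^m$ and a closed set $F_\varepsilon$ in $R^m$ such that 
$F_\varepsilon \subset S \subset G_\varepsilon$ and $\mu(G_\varepsilon \ominus S) < \varepsilon$, $\mu(S \ominus F_\varepsilon) < \varepsilon$. 


Proof. First, assume that S is bounded. Then $\mu^*(S) = \mu(S)$ is finite. Given 
$\varepsilon > 0$, then, by definition of $\mu^*$, there is a countable collection of open blocks $\{B_n\}$ 
such that $S \subset \bigcup_{n=1}^{\infty} B_n$ and $\sum_{n=1}^{\infty} v(B_n) = \sum_{n=1}^{\infty} \mu(B_n) < \mu(S) + \varepsilon$. Put $\bigcup_{n=1}^{\infty} B_n = G_\varepsilon$; 
then $G_\varepsilon$ is an open set in $R^m$. Furthermore, $\mu(G_\varepsilon) = \mu\big(\bigcup_{n=1}^{\infty} B_n\big) \le \sum_{n=1}^{\infty} \mu(B_n) < \mu(S) + \varepsilon$. 
Since $S \subset G_\varepsilon$ and $\mu(S) < +\infty$, it follows from Lemma 4.1.1 that 
$\mu(G_\varepsilon \ominus S) = \mu(G_\varepsilon) - \mu(S) < \varepsilon$. Secondly, suppose that S is an unbounded 
measurable set in $R^m$. For each positive integer n, consider the following open 
block $D_n$ in $R^m$: 

$$D_n = \{P \mid P \in R^m,\ -n < x_i < n,\ i = 1, 2, \ldots, m\}.$$

As an open set in $R^m$, $D_n$ is measurable, by Theorem 4.1.4. Clearly, for each n, $D_n$ is 
bounded. From $R^m = \bigcup_{n=1}^{\infty} D_n$, it follows that 
$S = S \cap R^m = S \cap \big[\bigcup_{n=1}^{\infty} D_n\big] = \bigcup_{n=1}^{\infty} (S \cap D_n) = \bigcup_{n=1}^{\infty} S_n$, where $S_n = S \cap D_n$. The previous results now apply to 
the bounded measurable set $S_n$: for $\varepsilon > 0$, and for each positive integer n, there is 
an open set $G_{\varepsilon,n} = G_n$ in $R^m$ such that $S_n \subset G_n$ and $\mu(G_n \ominus S_n) < \varepsilon 2^{-n}$. Let 
$G = \bigcup_{n=1}^{\infty} G_n$; then G is open in $R^m$, $G \supset S$ and 
$G \ominus S = \bigcup_{n=1}^{\infty} G_n \ominus \bigcup_{n=1}^{\infty} S_n \subset \bigcup_{n=1}^{\infty} (G_n \ominus S_n)$. Thus 

$$\mu(G \ominus S) \le \mu\Big[\bigcup_{n=1}^{\infty} (G_n \ominus S_n)\Big] \le \sum_{n=1}^{\infty} \mu(G_n \ominus S_n) < \sum_{n=1}^{\infty} \varepsilon 2^{-n} = \varepsilon.$$

To obtain the rest of the assertion, let $T = R^m \ominus S$; then T is measurable. 
Therefore for each $\varepsilon > 0$ there is an open set $G_\varepsilon$ in $R^m$ such that $T \subset G_\varepsilon$ and 
$\mu(G_\varepsilon \ominus T) < \varepsilon$. Let $F_\varepsilon = R^m \ominus G_\varepsilon$; then $F_\varepsilon$ is closed in $R^m$ and $F_\varepsilon \subset S$. Moreover, 
since $T \cap F_\varepsilon = \emptyset$, $S \ominus F_\varepsilon = (R^m \ominus T) \ominus F_\varepsilon = (R^m \ominus F_\varepsilon) \ominus T = G_\varepsilon \ominus T$. Thus 
$\mu(S \ominus F_\varepsilon) = \mu(G_\varepsilon \ominus T) < \varepsilon$. ∎ 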


EXERCISE 4.1 


1. Let X be any non-empty set. Call a subset S of X cocountable (in X) if its 
complement $X \ominus S$ is countable. Prove that the collection of all those subsets of X that 
are either countable or cocountable is a σ-algebra of subsets of X. 


2. Let B be a non-empty collection of subsets of X. Clearly, the family of all subsets of 
X is a σ-algebra containing B. Prove that the intersection of all the σ-algebras 
containing B is also a σ-algebra containing B. This is the smallest σ-algebra containing B, 
said to be generated by B. 


3. Prove that 

$$[a, b] = \bigcap_{n=1}^{\infty} \Big(a - \frac{1}{n},\ b + \frac{1}{n}\Big), \qquad (a, b) = \bigcup_{n=1}^{\infty} \Big[a + \frac{1}{n},\ b - \frac{1}{n}\Big].$$

Hence any σ-algebra of subsets of R which contains all open intervals also contains 
all closed intervals; any σ-algebra containing all closed intervals also contains open 
intervals. 


4. If μ is a measure on a σ-algebra $\mathcal{A}$ and A is a fixed set in $\mathcal{A}$, show that the function λ, 
defined for $S \in \mathcal{A}$ by $\lambda(S) = \mu(S \cap A)$, is a measure on $\mathcal{A}$. 


5. Prove Lemma 4.1.2. Show that part (2) of Lemma 4.1.2 may fail if the finiteness 
condition $\mu(T_1) < +\infty$ is dropped. 


6. Let $S_1, S_2, \ldots$ be a sequence of subsets of X. The set of all points $P \in X$ such that 
$P \in S_n$ for infinitely many values of n is called the limit superior of $\{S_n\}$ and denoted 
by lim sup $S_n$. It may be verified that 

$$\limsup_{n \to \infty} S_n = \bigcap_{n=1}^{\infty} \Big[\bigcup_{k=n}^{\infty} S_k\Big].$$

The set of all points $P \in X$ such that $P \in S_n$ for all sufficiently large values of n (how 
large may depend on P) is called the limit inferior of $\{S_n\}$ and denoted by lim inf $S_n$. 
It may be verified that 

$$\liminf_{n \to \infty} S_n = \bigcup_{n=1}^{\infty} \Big[\bigcap_{k=n}^{\infty} S_k\Big].$$

Let (X, A, μ) be a measure space and let $\{S_n\}$ be a sequence in A. Show that 
$\mu(\liminf_{n \to \infty} S_n) \le \liminf_{n \to \infty} \mu(S_n)$. Also show that 
$\limsup_{n \to \infty} \mu(S_n) \le \mu(\limsup_{n \to \infty} S_n)$ when $\mu\big(\bigcup_{n=1}^{\infty} S_n\big) < \infty$. 


7. Prove Lemma 4.1.4 and Theorem 4.1.1. 
8. Let X be any non-empty set. Let $\mu^*(\emptyset) = 0$, $\mu^*(X) = 2$, and $\mu^*(S) = 1$ for all other 
sets. Show that $\mu^*$ is an outer measure, and determine the class of $\mu^*$-measurable 
sets. 


9. Let $\mu^*$ be an outer measure in a set X. Suppose that $\{S_n\}$ is a sequence of disjoint 
$\mu^*$-measurable sets and $S = \bigcup_{n=1}^{\infty} S_n$. Show that $\mu^*(A \cap S) = \sum_{n=1}^{\infty} \mu^*(A \cap S_n)$ 
for any subset A of X. 


4.2 LEBESGUE MEASURABLE FUNCTIONS 


We shall now take up the theory of extended real-valued measurable functions 
with domains in $R^m$. 
Definition 4.2.1. An extended real-valued function f defined on a subset of $R^m$ 
is called (Lebesgue) measurable if its domain is Lebesgue measurable and if for 
each real number α, the set $\{P \mid f(P) > \alpha\}$ is Lebesgue measurable. 


Lemma 4.2.1. Let f be an extended real-valued function with a measurable 
domain. Then the following statements are equivalent: 

1) For each real number α, the set $\{P \mid f(P) > \alpha\}$ is measurable. 

2) For each real number α, the set $\{P \mid f(P) \le \alpha\}$ is measurable. 

3) For each real number α, the set $\{P \mid f(P) \ge \alpha\}$ is measurable. 

4) For each real number α, the set $\{P \mid f(P) < \alpha\}$ is measurable. 

These statements imply: 

5) For each extended real number $\alpha^*$, the set $\{P \mid f(P) = \alpha^*\}$ is measurable. 

Proof. Since $\{P \mid f(P) \le \alpha\}$ and $\{P \mid f(P) > \alpha\}$ are complements of each other, 
statement (1) is equivalent to statement (2). Similarly, statements (3) and (4) are 
equivalent. If (1) holds, then 

$$\{P \mid f(P) \ge \alpha\} = \bigcap_{n=1}^{\infty} \Big\{P \mid f(P) > \alpha - \frac{1}{n}\Big\}$$

is measurable, since the intersection of a sequence of measurable sets is 
measurable. Hence (1) implies (3). Similarly, (3) implies (1), since 

$$\{P \mid f(P) > \alpha\} = \bigcup_{n=1}^{\infty} \Big\{P \mid f(P) \ge \alpha + \frac{1}{n}\Big\},$$

and the union of a sequence of measurable sets is measurable. This shows that the 
first four statements are equivalent. If $\alpha^*$ is a real number, 

$$\{P \mid f(P) = \alpha^*\} = \{P \mid f(P) \le \alpha^*\} \cap \{P \mid f(P) \ge \alpha^*\},$$

and so (2) and (3) imply (5) for $\alpha^*$ real. Since 

$$\{P \mid f(P) = \infty\} = \bigcap_{n=1}^{\infty} \{P \mid f(P) \ge n\},$$

(3) implies (5) for $\alpha^* = \infty$. Similarly, (2) implies (5) for $\alpha^* = -\infty$, and we have 
(2) and (3) imply (5). ∎ 


Lemma 4.2.2. Let f and g be real-valued measurable functions and let c be a real 
number. Then the functions $cf$, $f^2$, $f + g$, $f - g$, $fg$, $|f|$ are also measurable. 


Proof. Let D and E be the domains of f and g respectively. Then $D \cap E$, which is 
the domain of f + g and fg, is measurable, since D and E are measurable. 

If c = 0, the statement is trivial. If c > 0, then $\{P \mid cf(P) > \alpha\}$ is the same as 
$\{P \mid f(P) > \alpha/c\}$, which is measurable for each real α. The case c < 0 can be handled 
similarly. 

The function $f^2$ is measurable, since 

$$\{P \mid f^2(P) > \alpha\} = \{P \mid f(P) > \sqrt{\alpha}\} \cup \{P \mid f(P) < -\sqrt{\alpha}\} \quad \text{for } \alpha \ge 0$$

and 

$$\{P \mid f^2(P) > \alpha\} = D \quad \text{if } \alpha < 0.$$

If $f(P) + g(P) < \alpha$, then $f(P) < \alpha - g(P)$ and by the density of rational 
numbers there is a rational number q such that $f(P) < q < \alpha - g(P)$. Hence 
$\{P \mid f(P) + g(P) < \alpha\} = \bigcup_q (\{P \mid f(P) < q\} \cap \{P \mid g(P) < \alpha - q\})$. Since the 
rationals are countable, this set is measurable and so f + g is measurable. Since 
$-g = (-1)g$ is measurable when g is, we have f − g measurable. 

The measurability of fg follows from $fg = \frac{1}{4}[(f + g)^2 - (f - g)^2]$ and those 
results we just obtained. Finally 

$$\{P \mid |f(P)| > \alpha\} = D \quad \text{for } \alpha < 0,$$

and 

$$\{P \mid |f(P)| > \alpha\} = \{P \mid f(P) > \alpha\} \cup \{P \mid f(P) < -\alpha\} \quad \text{if } \alpha \ge 0.$$

Thus the function |f| is measurable. ∎ 


If f is any real-valued function, let $f^+$ and $f^-$ be the non-negative functions 
defined by $f^+(P) = \max\{f(P), 0\}$, $f^-(P) = \max\{-f(P), 0\}$. The function $f^+$ is 
called the positive part of f, $f^-$ the negative part of f. Clearly, $f = f^+ - f^-$ and 
$|f| = f^+ + f^-$. Consequently, $f^+ = \frac{1}{2}(|f| + f)$ and $f^- = \frac{1}{2}(|f| - f)$. It follows 
from the previous lemma that f is measurable if and only if $f^+$ and $f^-$ are measurable. 
If f is an extended real-valued measurable function, it should be noted that 
$|f| + f$ is not defined at points where $|f| = \infty$ and $f = -\infty$. However, the 
measurability of $f^+$ and $f^-$ can still be proved by appealing to the definitions: 

$$f^+(P) = \max\{f(P), 0\} \quad \text{and} \quad f^-(P) = \max\{-f(P), 0\}.$$
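
The identities relating f, $f^+$, and $f^-$, and the remark about extended values, can be tried out numerically. This is a minimal sketch added for illustration: the sample function $p^3 - p$ and the helper names `positive_part` and `negative_part` are assumptions of the example, and floating-point infinities stand in for the extended real values.

```python
import math


def positive_part(f):
    """Return the function f^+ = max{f, 0}."""
    return lambda p: max(f(p), 0.0)


def negative_part(f):
    """Return the function f^- = max{-f, 0}."""
    return lambda p: max(-f(p), 0.0)


def f(p):
    """A sample real-valued function that changes sign."""
    return p ** 3 - p


f_plus, f_minus = positive_part(f), negative_part(f)
samples = [-2.0, -0.5, 0.0, 0.5, 2.0]

for p in samples:
    # f = f^+ - f^-  and  |f| = f^+ + f^-  at each sample point.
    print(p, f(p), f_plus(p), f_minus(p))

# At a point where f = -infinity the expression |f| + f is undefined (NaN here),
# while the max-based definitions still give f^+ = 0 and f^- = +infinity.
x = -math.inf
print(abs(x) + x, max(x, 0.0), max(-x, 0.0))
```

The last line mirrors the reason the text falls back on the definitions by maxima rather than on the formula $\frac{1}{2}(|f| + f)$ in the extended real-valued case.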


Applying the previous lemma, we can easily check the following 


Theorem 4.2.1. Let S be a Lebesgue measurable set. Then the set of all real- 
valued measurable functions defined on S forms an algebra over the reals. 


In order to see that measurability is preserved in “passing to the sequential 
limit,”” we state the following lemma, whose proof can easily be supplied. 


Lemma 4.2.3. Let $\{f_n\}$ be a sequence of extended real-valued measurable 
functions with common domain a measurable set S. Define the functions 

$$f(P) = \inf_n f_n(P), \qquad F(P) = \sup_n f_n(P),$$
$$f^*(P) = \liminf_{n \to \infty} f_n(P), \qquad F^*(P) = \limsup_{n \to \infty} f_n(P).$$

Then f, F, $f^*$, and $F^*$ are measurable. 

Theorem 4.2.2. If $\{f_n\}$ is a sequence of extended real-valued measurable 
functions with a measurable set S as common domain, and if $\{f_n\}$ converges to f on S, 
then f is measurable. 

Proof. In this case $f(P) = \lim f_n(P) = \liminf f_n(P)$. ∎ 


Theorem 4.2.3. Let D be a measurable subset of $R^m$. If an extended real-valued 
function f is defined and continuous on D, then f is measurable. 


Proof. If α is any real number, then $\{P \mid f(P) > \alpha\} = f^{-1}(\alpha, \infty]$. Now $(\alpha, \infty]$ is an 
open subset of the extended real line. $\{P \mid f(P) > \alpha\}$, as the preimage of $(\alpha, \infty]$ 
under the continuous function f, is therefore open in D (with the relative topology). 
There exists an open set $G_\alpha$ of $R^m$ such that $\{P \mid f(P) > \alpha\} = G_\alpha \cap D$. The 
measurability of f follows from Theorems 4.1.3 and 4.1.4. ∎ 


So far we have restricted ourselves to extended real-valued functions f. If f 
should be complex-valued, we say f is measurable iff its real and imaginary parts are 
measurable in the previously defined sense. In Exercise 4.2 it is asked to prove that 
sums, differences, and products of complex-valued measurable functions are 
measurable. Note that for a complex-valued measurable function f we have a 
decomposition of the form 

$$f(P) = f_1(P) + i f_2(P) - f_3(P) - i f_4(P),$$

where $f_1$ to $f_4$ are real non-negative measurable functions. For example: Let g(P) 
be the real part of f(P), h(P) the imaginary part of f(P), and take 

$$f_1(P) = g^+(P), \quad f_3(P) = g^-(P), \quad f_2(P) = h^+(P), \quad f_4(P) = h^-(P).$$

The decomposition is clearly not unique: we can add the same constant $c_1$ to $f_1$ and 
$f_3$ and the same constant $c_2$ to $f_2$ and $f_4$ without affecting the result. 


EXERCISE 4.2 


In Exercise 4.2, by measurable functions we shall mean Lebesgue measurable functions. 


1. Let D be a dense set of real numbers, that is, a set of real numbers such that every 
interval contains an element of D. If f is an extended real-valued function on R such 
that $\{P \mid f(P) > d\}$ is measurable for each $d \in D$, prove that f is measurable. 

2. Suppose f = g a.e.; show that f is measurable iff g is measurable. 

3. (a) Let f be an extended real-valued function with measurable domain D, D a subset 
of $R^m$. Let $D_1 = \{P \mid f(P) = \infty\}$, $D_2 = \{P \mid f(P) = -\infty\}$. Then f is measurable 
iff $D_1$ and $D_2$ are measurable and the restriction of f to $D \ominus (D_1 \cup D_2)$ is 
measurable. 

(b) Prove that the product of two measurable extended real-valued functions is 
measurable. 

(c) If f and g are measurable extended real-valued functions and α a fixed number, 
then f + g is measurable if we define f + g to be α whenever it is of the form $\infty - \infty$ 
or $-\infty + \infty$. 

4. Let f be an extended real-valued measurable function and let c be a fixed positive 
number. Prove that the truncation $f_c$ defined by 

$$f_c(P) = \begin{cases} f(P) & \text{if } |f(P)| \le c, \\ c & \text{if } f(P) > c, \\ -c & \text{if } f(P) < -c \end{cases}$$

is measurable. 
5. If f and g are extended real-valued measurable functions with the same domain of 
definition D, show that the extended real-valued functions h and k defined by 
$$h(P) = \max\{f(P), g(P)\}, \quad \forall P \in D,$$
$$k(P) = \min\{f(P), g(P)\}, \quad \forall P \in D,$$
are measurable. In particular, $f^+$ and $f^-$ are measurable if f is an extended 
real-valued measurable function with 

$$f^+(P) = \max\{f(P), 0\} \quad \text{and} \quad f^-(P) = \max\{-f(P), 0\}.$$
6. Prove Lemma 4.2.3. 


7. If f(P) is measurable and non-negative, show that its positive square root has the 
same properties. 


8. Let f be a real-valued measurable function with domain $R^m$, and let φ be a continuous 
function on $R^1$ to $R^1$. Show that the composition $\varphi \circ f$, defined by 
$(\varphi \circ f)(P) = \varphi[f(P)]$, is measurable. 


9. Show that sums, products, and limits of complex-valued measurable functions are 
measurable. 


4.3 LEBESGUE INTEGRATION 


With the Lebesgue theory of measure at our disposal, we can generalize the problem 
of “measuring the area under a curve.” First we introduce the notion of a set of 
ordinates basic in the geometric treatment of integration. 


Definition 4.3.1. Let S be any subset of $R^m$ and let f(P) be an extended 
real-valued non-negative function on S; let 

$$\Omega_0(f; S) = \{(x_1, x_2, \ldots, x_m, x_{m+1}) \mid P = (x_1, x_2, \ldots, x_m) \in S \text{ and } 0 < x_{m+1} < f(P)\},$$
$$\Omega_1(f; S) = \{(x_1, x_2, \ldots, x_m, x_{m+1}) \mid P = (x_1, x_2, \ldots, x_m) \in S \text{ and } 0 \le x_{m+1} \le f(P)\};$$

then $\Omega_0(f; S)$ is called the least, and $\Omega_1(f; S)$ is called the greatest set of 
ordinates of f(P) over S. The symbol $\Omega(f; S)$ is used to denote any set T 
satisfying $\Omega_0(f; S) \subset T \subset \Omega_1(f; S)$, and T is called a set of ordinates of f(P) 
over S. 


It is readily seen that $\Omega(f; S) \subset R^{m+1}$. We must now also consider (m + 1)-
dimensional sets. Later we shall write $\mu_m$ and $\mu_{m+1}$ for Lebesgue measures in $R^m$ 
and $R^{m+1}$, respectively. At this stage we restrict ourselves to extended real-valued 
non-negative functions. After we extend to arbitrary real-valued functions, there 
will be a natural way to deal with complex-valued functions. 


Lemma 4.3.1. Let Z be a cylinder in $R^{m+1}$ with height h and base B, i.e. 

$$Z = \{(x_1, x_2, \ldots, x_m, x_{m+1}) \mid (x_1, x_2, \ldots, x_m) \in B \text{ and } 0 < x_{m+1} < h\}.$$

If B is measurable, then Z is measurable and $\mu_{m+1}(Z) = h\mu_m(B)$. 


Proof. Suppose B is a half-open block. It can be verified easily (Lemmas 4.1.7 and 
4.1.8 should be helpful) that in this case Z is measurable and 

$$\mu_{m+1}(Z) = h\,\mu_m(B). \tag{4.3.1}$$

If B is an open set, B can be written as the union of a countable collection of 
disjoint half-open blocks by Lemma 4.1.9. Consequently, (4.3.1) holds for any open 
base B by the countable additivity of the measure. It is then also true for a closed 
base. (Why? Hint: Consider two cases, i.e. B is bounded or unbounded.) To deal 
with an arbitrary measurable base B, we apply Theorem 4.1.5, which states: For each 
$\varepsilon > 0$ there exist an open set $G_\varepsilon$ and a closed set $F_\varepsilon$ such that $F_\varepsilon \subset B \subset G_\varepsilon$ with 
$\mu_m(G_\varepsilon \ominus B) < \varepsilon$ and $\mu_m(B \ominus F_\varepsilon) < \varepsilon$. From this it follows that 
$h\mu_m(B) - h\varepsilon \le \mu^*(Z) \le h\mu_m(B) + h\varepsilon$ for each $\varepsilon > 0$. Therefore $\mu^*(Z) = h\mu_m(B)$. To prove 
that Z, with a bounded measurable base B, is an (m + 1)-dim measurable set, we 
introduce another criterion of measurability. (Cf. Exercise 4.3.3.) Given a subset S 
of $R^{m+1}$ with $\mu^*(S) < \infty$, S is measurable iff $\mu^*(S) = \mu_*(S)$, where $\mu_*(S)$ is the 
so-called Lebesgue inner measure of S and is defined by 

$$\mu_*(S) = \sup\{\mu_{m+1}(E) \mid E \text{ is measurable and } E \subset S\}.$$

By Theorem 4.1.5, for any $\varepsilon > 0$ there is a closed set $F_\varepsilon$ such that $F_\varepsilon \subset B$ and 
$\mu_m(B \ominus F_\varepsilon) < \varepsilon$. From this it follows that $\mu_*(Z) = h\mu_m(B) = \mu^*(Z)$. Thus Z is 
measurable with $\mu(Z) = \mu^*(Z) = h\mu_m(B)$, provided B is bounded and measurable. 
Finally, suppose Z has an unbounded measurable base B. Then, as shown in the 
proof of Theorem 4.1.5, there is an increasing sequence of bounded measurable sets 
whose union is B. Apply Lemma 4.1.2 to infer that (4.3.1) holds for an unbounded 
measurable base B. ∎ 


Theorem 4.3.1. Let f(P) be an extended real-valued non-negative function
defined on an m-dim measurable set B. If f(P) is measurable on B,
then $\Omega_0(f;B)$ is an (m + 1)-dim measurable set.


Proof. We start with the proof that $\Omega_0(f;B)$ is the union of a sequence of
cylinders (except when f(P) = 0 for every P of B, in which case $\Omega_0(f;B) = \emptyset$).
Consider the set of all positive rationals. This set is countable; hence its elements
may be listed $h_1, h_2, h_3, \ldots, h_r, \ldots$. Let $B_r = \{P \mid P \in B,\ h_r < f(P)\}$
(r = 1, 2, ...). Let $Z_r$ denote a cylinder in $R^{m+1}$ with height $h_r$ and base
$B_r$, i.e.

$Z_r = \{(x_1, x_2, \ldots, x_m, x_{m+1}) \mid (x_1, x_2, \ldots, x_m) \in B_r,\ 0 \le x_{m+1} < h_r\}.$

Clearly $\bigcup_{r=1}^{\infty} Z_r \subset \Omega_0(f;B)$. On the other hand, if

$(P, x_{m+1}) = (x_1, x_2, x_3, \ldots, x_m, x_{m+1}) \in \Omega_0(f;B),$

then there is a positive rational number, say $h_r$, such that
$0 \le x_{m+1} < h_r < f(P)$, which means that $(P, x_{m+1}) \in Z_r$; hence
$\Omega_0(f;B) \subset \bigcup_{r=1}^{\infty} Z_r$. Thus $\Omega_0(f;B)$ is the union of a
sequence of cylinders.

To prove the measurability of $\Omega_0(f;B)$, it suffices to prove that $Z_r$ is
measurable for each r. Since f is measurable, $B_r$ is measurable for each r. By
Lemma 4.3.1, $Z_r$ is measurable. ∎


The measurability of $\Omega_0(f;B)$ enables us to give the following


Definition 4.3.2. Let f be an extended real-valued non-negative function defined
and measurable on an m-dim measurable set S. Then the Lebesgue integral of f
over S, $\int_S f(P)\,dP$, is defined by

$\int_S f(P)\,dP = \mu_{m+1}[\Omega_0(f;S)].$

If $\int_S f(P)\,dP < \infty$, then f is said to be integrable (or summable) over S.


By Lemma 4.3.1 it is clear that there are integrable functions. For example,
the constant function f(P) = 1 defined on a set S of finite measure is integrable over
S and its integral is $\mu_m(S)$. In this case the least set of ordinates, $\Omega_0(f;S)$,
is a cylinder in $R^{m+1}$ with base S and height 1.

We have considered in some detail the case where the least set of ordinates is a 
cylinder with a measurable base. We can now extend to the case where the least set 
of ordinates is the union of a finite number of disjoint cylinder sets with 
measurable bases. The corresponding functions f(P) are known as simple functions. 

Let a bounded measurable set S in $R^m$ have a partition into disjoint measurable
subsets as follows.

$S = S_1 \cup S_2 \cup \cdots \cup S_l, \quad S_j \cap S_k = \emptyset, \quad j \ne k.$ (4.3.2)

It is assumed that f has a constant value on each $S_j$, say

$f(P) = \alpha_j, \quad P \in S_j, \quad j = 1, 2, \ldots, l.$ (4.3.3)


Theorem 4.3.2. A non-negative simple function f as defined by (4.3.2) and (4.3.3)
is integrable over S and

$\int_S f(P)\,dP = \sum_{j=1}^{l} \alpha_j \mu_m(S_j).$ (4.3.4)


Proof. Clearly, such a simple function f is measurable. We assume $\alpha_j > 0$,
j = 1, 2, ..., l. By Lemma 4.3.1, $\alpha_j \mu_m(S_j) < \infty$ is the (m + 1)-dim measure
of the jth cylinder with height $\alpha_j$ and measurable base $S_j$. As the union of l
disjoint measurable cylinders, $\Omega_0(f;S)$ therefore yields

$\int_S f(P)\,dP = \mu_{m+1}[\Omega_0(f;S)] = \sum_{j=1}^{l} \alpha_j \mu_m(S_j) < \infty,$

by the countable additivity of $\mu_{m+1}$. ∎
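
As a small numerical illustration of (4.3.4), the following Python sketch integrates a non-negative simple function on an interval of the real line (m = 1). The partition of S = [0, 3) into three half-open intervals, the values α_j, and the helper name `integrate_simple` are assumptions made only for this example; each term α_j μ_1(S_j) is the area of one cylinder (here a rectangle) of the least set of ordinates.

```python
from fractions import Fraction

def integrate_simple(pieces):
    """Integral of a non-negative simple function, following (4.3.4).

    `pieces` is a list of (alpha_j, measure_of_S_j) pairs for a finite
    partition of S into disjoint measurable sets S_j.
    """
    total = Fraction(0)
    for alpha, measure in pieces:
        if alpha < 0 or measure < 0:
            raise ValueError("values and measures must be non-negative")
        # Each term is the (m+1)-dim measure of one cylinder (Lemma 4.3.1).
        total += Fraction(alpha) * Fraction(measure)
    return total

# Hypothetical example: S = [0, 3) split into [0, 1), [1, 2.5), [2.5, 3).
pieces = [
    (Fraction(2), Fraction(1)),        # f = 2 on [0, 1)
    (Fraction(1, 2), Fraction(3, 2)),  # f = 1/2 on [1, 2.5)
    (Fraction(4), Fraction(1, 2)),     # f = 4 on [2.5, 3)
]
integral = integrate_simple(pieces)
print(integral)  # 19/4
```

Exact rational arithmetic is used so that the sum of the rectangle areas can be compared with the expected value without rounding.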


Formula (4.3.4) may be extended in various directions. First the restriction to
simple functions may be dropped. A countable partition of S,
$S = \bigcup_{j=1}^{\infty} S_j$, each $S_j$ measurable, $S_j \cap S_k = \emptyset$, j ≠ k,
can be used, if a convergence condition $\sum_{j=1}^{\infty} \alpha_j \mu_m(S_j) < \infty$ is
imposed. If the α's are merely assumed to be real, $\alpha_j$ is to be replaced by
$|\alpha_j|$ for j = 1, 2, ....

We have seen that a partition on S introduces an associated partition on
$\Omega_0(f;S)$. This suggests other generalizations of (4.3.4), which will show that
the fact that the components of $\Omega_0(f;S)$ sometimes are all cylinders is really
immaterial. We present the following


Theorem 4.3.3. Let S be a bounded measurable set in $R^m$, f(P) a non-negative
integrable function over S. Suppose that $S = \bigcup_j S_j$, $S_j \cap S_k = \emptyset$,
j ≠ k, is a finite or countably infinite partition of S into disjoint measurable
subsets. Then f is integrable over each of the sets $S_j$ (more precisely, the
restriction of f to $S_j$ is integrable over $S_j$) and

$\int_S f(P)\,dP = \sum_j \int_{S_j} f(P)\,dP.$


Proof. For each j, let $f_j$ denote the restriction of f to $S_j$, i.e. $f_j = f|S_j$.
It can be verified easily that $f_j$ is measurable. This implies that for each j,
$\Omega_0(f_j;S_j)$ is measurable, by Theorem 4.3.1. Since $S_j \cap S_k = \emptyset$ for j ≠ k,

$\Omega_0(f_j;S_j) \cap \Omega_0(f_k;S_k) = \emptyset, \quad j \ne k.$

On the other hand, $\Omega_0(f;S) = \bigcup_j \Omega_0(f_j;S_j)$. By the countable
additivity of $\mu_{m+1}$, $\mu_{m+1}[\Omega_0(f;S)] = \sum_j \mu_{m+1}[\Omega_0(f_j;S_j)]$.
Since f is integrable over S, $\mu_{m+1}[\Omega_0(f;S)] < \infty$ and
$\int_S f(P)\,dP = \sum_j \int_{S_j} f(P)\,dP < \infty$. Here, without causing any
ambiguity, $\int_{S_j} f(P)\,dP$ denotes $\int_{S_j} f_j(P)\,dP$, which is obviously
finite. ∎


Theorem 4.3.4. Let $f_1$ and $f_2$ be two extended real-valued non-negative
measurable functions defined on a measurable set S. If $f_2$ is integrable over S and
if $f_1(P) \le f_2(P)$, ∀ P ∈ S, then $f_1$ is integrable over S and
$\int_S f_1(P)\,dP \le \int_S f_2(P)\,dP$.


Proof. The conclusion follows from $\Omega_0(f_1;S) \subset \Omega_0(f_2;S)$ and
Lemma 4.1.1. ∎


Theorem 4.3.5. (Monotone Convergence Theorem.) Let $\{f_n\}$ be a monotone
increasing sequence of extended real-valued non-negative measurable functions
with a measurable set S as common domain. Then
$\int_S f(P)\,dP = \lim_{n\to\infty} \int_S f_n(P)\,dP$, where
$f(P) = \lim_{n\to\infty} f_n(P)$. (This limit function always exists. Why?)
Furthermore, f is integrable over S iff the sequence $\{\int_S f_n(P)\,dP\}$ is bounded.

Proof. Since $f_n$ is measurable for each n, the sequential limit f of $\{f_n\}$ is
therefore measurable, by Theorem 4.2.2. From $f_n(P) \le f_{n+1}(P)$ for each n and
for each P ∈ S, and from $f(P) = \lim_{n\to\infty} f_n(P)$, it follows that for each n,
$\Omega_0(f_n;S) \subset \Omega_0(f_{n+1};S)$ and
$\Omega_0(f;S) = \bigcup_{n=1}^{\infty} \Omega_0(f_n;S)$. Apply Lemma 4.1.2 to infer

$\mu_{m+1}[\Omega_0(f;S)] = \lim_{n\to\infty} \mu_{m+1}[\Omega_0(f_n;S)],$

that is, $\int_S f(P)\,dP = \lim_{n\to\infty} \int_S f_n(P)\,dP$. Here $\int_S f(P)\,dP$
may be +∞. Clearly it is finite iff the sequence $\{\int_S f_n(P)\,dP\}$ is bounded. ∎


Theorem 4.3.6. Let f(P) be an extended real-valued non-negative measurable
function defined on a measurable set S. Then there exists a sequence of simple
functions $\{f_n(P)\}$ such that

1) $0 \le f_n(P) \le f_{n+1}(P)$, ∀ P ∈ S, ∀ n ∈ N.
2) $f(P) = \lim_{n\to\infty} f_n(P)$, for each P ∈ S.


Proof. Let n be a fixed natural number. If k = 1, 2, ..., $n2^n$, let $E_{nk}$ be the set

$E_{nk} = \{P \mid P \in S,\ (k-1)2^{-n} \le f(P) < k2^{-n}\},$

and if k = $n2^n + 1$, let

$E_{nk} = \{P \mid P \in S,\ n \le f(P)\}.$

We observe that the sets $\{E_{nk} \mid k = 1, 2, \ldots, n2^n + 1\}$ are disjoint with
their union equal to S, and the measurability of f implies the measurability of
each $E_{nk}$.

A sequence of simple functions $\{f_n(P)\}$ can be constructed by defining
$f_n(P) = (k-1)2^{-n}$ for P in $E_{nk}$. Clearly, we have $f_n(P) \le f(P)$ for all P in
S and all n in N. To prove (1) and (2), we observe first that, if $f_n(P) = k2^{-n}$,
then $k2^{-n} \le f(P)$ and $0 \le k \le n2^n$; hence $2k\,2^{-n-1} \le f(P)$ and

$0 \le 2k \le n2^{n+1} < (n+1)2^{n+1},$

which, by the definition of $f_{n+1}(P)$, implies $f_n(P) = 2k\,2^{-n-1} \le f_{n+1}(P)$.
Now if f(P) < ∞, it follows from the construction of $f_n(P)$ that n > f(P) implies
$0 \le f(P) - f_n(P) \le 2^{-n}$, and so $\lim_{n\to\infty} f_n(P) = f(P)$.

Finally, by the construction of $f_n(P)$, if f(P) = +∞, then $f_n(P) = n$
(n = 1, 2, ...), and so (1) and (2) still hold. ∎
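
The dyadic construction used in this proof can be traced pointwise. The Python sketch below computes the value of the n-th approximating simple function at a point where f takes a given value; the function name `dyadic_approx` and the sample value 22/7 are assumptions chosen for illustration, and the rule "largest multiple of $2^{-n}$ not exceeding f(P), capped at n" is how the construction is read here.

```python
from fractions import Fraction
import math

def dyadic_approx(value, n):
    """Value f_n(P) of the n-th simple approximation, given value = f(P) >= 0.

    f_n = (k - 1) / 2**n when (k - 1)/2**n <= f(P) < k/2**n and f(P) < n,
    and f_n = n when f(P) >= n (including f(P) = +infinity).
    """
    if value == math.inf or value >= n:
        return Fraction(n)
    # floor(value * 2**n) is the index k - 1 of the dyadic interval.
    return Fraction(math.floor(Fraction(value) * 2**n), 2**n)

x = Fraction(22, 7)  # hypothetical value of f at some point P
approximations = [dyadic_approx(x, n) for n in range(1, 12)]
for n, a in enumerate(approximations, start=1):
    print(n, a, float(x - a))
```

The printed sequence is non-decreasing, and once n exceeds f(P) the gap f(P) - f_n(P) is at most $2^{-n}$, which is the estimate used to obtain property (2).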


Corollary 1. An extended real-valued non-negative measurable function f is
integrable over its domain S iff the sequence $\{J_n\}$ is bounded, where

$J_n = \sum_{k=1}^{n2^n+1} (k-1)2^{-n} \mu_m(E_{nk}).$

Further, $\int_S f(P)\,dP = \lim_{n\to\infty} J_n$.

Proof. It follows from the previous theorem and from the Monotone Convergence
Theorem that

$\int_S f(P)\,dP = \lim_{n\to\infty} \int_S f_n(P)\,dP$
$= \lim_{n\to\infty} \sum_{k=1}^{n2^n+1} (k-1)2^{-n} \mu_m(E_{nk})$
$= \lim_{n\to\infty} J_n.$

Clearly, $\int_S f(P)\,dP$ is finite iff the sequence $\{J_n\}$ is bounded. ∎
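
To see the sums $J_n$ at work, the following Python sketch evaluates them for one concrete case. The choice f(x) = x² on S = [0, 1] (so m = 1) and the function name `J` are assumptions made for the example; for this f each set $E_{nk}$ is an interval whose length can be written down directly, and the limit should be the familiar value 1/3.

```python
import math

def J(n):
    """J_n of Corollary 1 for f(x) = x**2 on S = [0, 1] (an assumed example).

    Here E_nk = {x in [0, 1] : (k-1)/2**n <= x**2 < k/2**n} is the interval
    [sqrt((k-1)/2**n), sqrt(k/2**n)) cut off at 1, so its measure is explicit.
    The top set {f >= n} has measure zero for n >= 1 and contributes nothing.
    """
    total = 0.0
    for k in range(1, n * 2**n + 1):
        left = min(1.0, math.sqrt((k - 1) / 2**n))
        right = min(1.0, math.sqrt(k / 2**n))
        total += (k - 1) / 2**n * (right - left)
    return total

values = [J(n) for n in range(1, 11)]
for n, v in enumerate(values, start=1):
    print(n, v, 1 / 3 - v)
```

The sums increase with n and stay below 1/3, and the gap 1/3 - $J_n$ is bounded by $2^{-n}$ because $f - f_n \le 2^{-n}$ wherever f < n.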


Corollary 2. Let α be any real number such that α ≥ 0. If f is an extended
real-valued non-negative function integrable over the measurable set S, then αf is
integrable over S and $\int_S (\alpha f)(P)\,dP = \alpha \int_S f(P)\,dP$.


Proof. It is clear that αf is measurable. (Cf. Lemma 4.2.1.) If α = 0, then
αf = 0, $\Omega_0(\alpha f;S) = \emptyset$ and

$\int_S (\alpha f)(P)\,dP = \mu_{m+1}[\Omega_0(\alpha f;S)] = \mu_{m+1}(\emptyset) = 0 = \alpha \int_S f(P)\,dP.$

Assume now α > 0. Apply the previous theorem to infer that there exists a
monotone increasing sequence of simple functions $\{f_n(P)\}$ such that
$\lim_{n\to\infty} f_n(P) = f(P)$ for each P ∈ S. Then $\{\alpha f_n(P)\}$ is a monotone
increasing sequence of simple functions which converges to (αf)(P). It follows from
Theorems 4.3.5 and 4.3.2 that αf is integrable over S and

$\int_S (\alpha f)(P)\,dP = \lim_{n\to\infty} \int_S (\alpha f_n)(P)\,dP$
$= \lim_{n\to\infty} \alpha \int_S f_n(P)\,dP$
$= \alpha \lim_{n\to\infty} \int_S f_n(P)\,dP = \alpha \int_S f(P)\,dP.$ ∎


Corollary 3. If f and g are real-valued non-negative functions both integrable
over the measurable set S, then f + g is integrable over S and

$\int_S (f+g)(P)\,dP = \int_S f(P)\,dP + \int_S g(P)\,dP.$


Proof. By Lemma 4.2.2, f + g is measurable. If $\{f_n\}$ and $\{g_n\}$ are monotone
increasing sequences of simple functions converging to f and g, respectively, then
$\{f_n + g_n\}$ is a monotone increasing sequence of simple functions converging to
f + g. It follows from Theorem 4.3.2 and the Monotone Convergence Theorem
that f + g is integrable over S and

$\int_S (f+g)(P)\,dP = \lim_{n\to\infty} \int_S (f_n+g_n)(P)\,dP$
$= \lim_{n\to\infty} \left[\int_S f_n(P)\,dP + \int_S g_n(P)\,dP\right]$
$= \lim_{n\to\infty} \int_S f_n(P)\,dP + \lim_{n\to\infty} \int_S g_n(P)\,dP$
$= \int_S f(P)\,dP + \int_S g(P)\,dP.$ ∎


We shall now discuss the integration of extended real-valued measurable
functions which may take on both positive and negative values. Let f be an extended
real-valued function. In Section 4.2 we have set $f = f^+ - f^-$ with
$f^+(P) = \max\{f(P), 0\}$ and $f^-(P) = \max\{-f(P), 0\}$. This decomposition of f
into the difference of two extended real-valued non-negative functions suggests the
following.


Definition 4.3.3. An extended real-valued measurable function f(P) defined on
a measurable set S is integrable over S iff $f^+(P)$ and $f^-(P)$ are integrable over S.
In this case, we define its Lebesgue integral over S to be

$\int_S f(P)\,dP = \int_S f^+(P)\,dP - \int_S f^-(P)\,dP.$


Although the integral of f is defined to be the difference of the integrals of
$f^+$, $f^-$, we remark that if $f = f_1 - f_2$, where $f_1$, $f_2$ are any non-negative
measurable functions with finite integrals over S, then

$\int_S f(P)\,dP = \int_S f_1(P)\,dP - \int_S f_2(P)\,dP.$

In fact, since $f^+ - f^- = f = f_1 - f_2$, it follows that $f^+ + f_2 = f_1 + f^-$. Apply
Corollary 3 to Theorem 4.3.6 to infer that

$\int_S f^+(P)\,dP + \int_S f_2(P)\,dP = \int_S f_1(P)\,dP + \int_S f^-(P)\,dP.$

Since all these terms are finite, the following is obtained:

$\int_S f(P)\,dP = \int_S f^+(P)\,dP - \int_S f^-(P)\,dP = \int_S f_1(P)\,dP - \int_S f_2(P)\,dP.$
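
The independence of the integral from the chosen decomposition can be checked on a small case. In the Python sketch below the simple function (three values on pieces of given measure), the shift by 5, and the helper name `integral` are all assumptions made for illustration; the two decompositions should give the same number.

```python
from fractions import Fraction

# Hypothetical simple function on a partition of S: (value, measure of piece).
pieces = [(Fraction(3), Fraction(1, 2)),
          (Fraction(-1), Fraction(1)),
          (Fraction(1, 4), Fraction(2))]

def integral(values_and_measures):
    """Integral of a simple function as the sum of value * measure."""
    return sum((v * m for v, m in values_and_measures), Fraction(0))

# Canonical decomposition f = f_plus - f_minus.
f_plus = [(max(v, Fraction(0)), m) for v, m in pieces]
f_minus = [(max(-v, Fraction(0)), m) for v, m in pieces]
canonical = integral(f_plus) - integral(f_minus)

# Another decomposition f = f1 - f2 with f1 = f_plus + 5 and f2 = f_minus + 5.
shift = Fraction(5)
f1 = [(v + shift, m) for v, m in f_plus]
f2 = [(v + shift, m) for v, m in f_minus]
alternative = integral(f1) - integral(f2)

print(canonical, alternative)  # 1 1
```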


The next result is sometimes referred to as the property of absolute integrability 
of the Lebesgue integral. 
Theorem 4.3.7. A real-valued measurable function f defined on a measurable
set S is integrable over S iff |f| is integrable over S. In this case

$\left|\int_S f(P)\,dP\right| \le \int_S |f(P)|\,dP,$ (4.3.5)

where the equality holds iff f(P) has the same sign almost everywhere on S.


Proof. By definition f is integrable over S iff $f^+$ and $f^-$ are integrable over S.
Since $|f|^+ = |f| = f^+ + f^-$, $|f|^- = 0$, $0 \le f^+ \le |f|$ and $0 \le f^- \le |f|$, the
first part of the assertion follows from Theorem 4.3.4 and Corollary 3 to Theorem
4.3.6. Moreover,

$\left|\int_S f(P)\,dP\right| = \left|\int_S f^+(P)\,dP - \int_S f^-(P)\,dP\right|$
$\le \int_S f^+(P)\,dP + \int_S f^-(P)\,dP = \int_S |f(P)|\,dP.$

For the proof of the statement on equality we shall find the following helpful.


Theorem 4.3.8. Let f be a real-valued non-negative function defined and 
integrable on a measurable set S. Then the integral of f over S is positive iff f is 
positive in a subset of S of positive measure. 


Proof. Suppose that the measurable set $S^+ = \{P \mid P \in S, f(P) > 0\}$ has positive
measure. For every integer n define the set

$E_n = \{P \mid P \in S,\ 2^{n-1} \le f(P) < 2^n\}.$

We observe that the sets $E_n$ (n = 0, ±1, ±2, ...) are disjoint measurable sets with
their union equal to $S^+$. The countable additivity of the measure implies
$\mu_m(S^+) = \sum_{n=-\infty}^{\infty} \mu_m(E_n)$. This is a convergent series of
non-negative terms with a positive sum. Hence at least one term is positive, say
$\mu_m(E_k) > 0$. Then
$\int_S f(P)\,dP = \int_{S^+} f(P)\,dP \ge \int_{E_k} f(P)\,dP \ge 2^{k-1}\mu_m(E_k) > 0$
as asserted. On the other hand, if all terms in
$\mu_m(S^+) = \sum_{n=-\infty}^{\infty} \mu_m(E_n)$ are zero, then the least ordinate
set associated with the restriction of f to $E_n$ is a subset of the cylinder set with
base $E_n$ and height $2^n$. Since $\mu_m(E_n) = 0$, it follows from Lemma 4.3.1 that the
(m + 1)-dim measure of such a cylinder set is 0. Hence
$\mu_{m+1}[\Omega_0(f;S)] = 0$. Thus the condition is necessary as well as sufficient. ∎


We are now ready to complete the proof of Theorem 4.3.7. 


Put $\int_S f^+(P)\,dP = \alpha$, $\int_S f^-(P)\,dP = \beta$. Put
$S^+ = \{P \mid P \in S, f(P) > 0\}$, $S^- = \{P \mid P \in S, f(P) < 0\}$. Then the right
member of (4.3.5) is α + β while the left is either α - β or β - α. Now
α - β = α + β iff β = 0; β - α = α + β iff α = 0. By Theorem 4.3.8, β = 0 iff
$\mu_m(S^-) = 0$, i.e. f(P) is ≥ 0 almost everywhere on S; α = 0 iff
$\mu_m(S^+) = 0$, i.e. f(P) is ≤ 0 almost everywhere on S. ∎
The linearity of the Lebesgue integration should become clear through the
following


Theorem 4.3.9. Let α be any real number. If f and g are real-valued functions
defined and integrable over a measurable set S, then αf and f + g are integrable
over S with

$\int_S (\alpha f)(P)\,dP = \alpha \int_S f(P)\,dP,$

$\int_S [f(P) + g(P)]\,dP = \int_S f(P)\,dP + \int_S g(P)\,dP.$


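Proof. The measurability of the functions αf and f + g is established in Lemma 4.2.2.
If α = 0, then αf = 0 everywhere so that

$\int_S \alpha f(P)\,dP = \mu_{m+1}[\Omega_0(\alpha f;S)] = \mu_{m+1}(\emptyset) = 0 = \alpha \int_S f(P)\,dP.$

If α > 0, then $(\alpha f)^+ = \alpha f^+$ and $(\alpha f)^- = \alpha f^-$, whence

$\int_S (\alpha f)(P)\,dP = \int_S \alpha f^+(P)\,dP - \int_S \alpha f^-(P)\,dP.$

By Corollary 2 to Theorem 4.3.6

$\int_S \alpha f^+(P)\,dP = \alpha \int_S f^+(P)\,dP, \quad \int_S \alpha f^-(P)\,dP = \alpha \int_S f^-(P)\,dP$

and

$\int_S (\alpha f)(P)\,dP = \alpha \int_S f^+(P)\,dP - \alpha \int_S f^-(P)\,dP = \alpha \int_S f(P)\,dP.$

If α < 0, let α = -β with β > 0. Then $(\alpha f)^+ = (\beta f)^- = \beta f^-$ and
$(\alpha f)^- = (\beta f)^+ = \beta f^+$. Thus

$\int_S (\alpha f)(P)\,dP = \int_S (\alpha f)^+(P)\,dP - \int_S (\alpha f)^-(P)\,dP$
$= \int_S (\beta f^-)(P)\,dP - \int_S (\beta f^+)(P)\,dP$
$= \beta \int_S f^-(P)\,dP - \beta \int_S f^+(P)\,dP$
$= -\beta \int_S f(P)\,dP = \alpha \int_S f(P)\,dP.$
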
If f and g are integrable over S, so are |f| and |g|. Since |f + g| ≤ |f| + |g|,
it follows from Corollary 3 to Theorem 4.3.6 and from Theorem 4.3.4 that f + g
is integrable. To establish the desired relation, we observe that

$f + g = (f^+ + g^+) - (f^- + g^-).$

Since $f^+ + g^+$ and $f^- + g^-$ are non-negative integrable functions, it follows from
the remark made after Definition 4.3.3 that

$\int_S (f+g)(P)\,dP = \int_S (f^+ + g^+)(P)\,dP - \int_S (f^- + g^-)(P)\,dP.$

Using Corollary 3 to Theorem 4.3.6, we obtain

$\int_S (f+g)(P)\,dP = \int_S f^+(P)\,dP - \int_S f^-(P)\,dP + \int_S g^+(P)\,dP - \int_S g^-(P)\,dP$
$= \int_S f(P)\,dP + \int_S g(P)\,dP.$ ∎


Recall that a complex-valued function f(P) defined on a measurable set S is
said to be measurable iff both the real part, g(P), and the imaginary part, h(P), of
f(P) are measurable. Similarly, we have


Definition 4.3.4. A complex-valued measurable function f defined on a
measurable set S is said to be integrable over S iff both g and h are integrable over
S. In this case, we define the Lebesgue integral of f(P) = g(P) + ih(P) to be

$\int_S f(P)\,dP = \int_S g(P)\,dP + i\int_S h(P)\,dP.$


With this definition in mind, we can easily extend the entire theory of 
integration to complex-valued measurable functions. For example: 


Theorem 4.3.10. A complex-valued function f(P) is integrable over the
measurable set S iff |f(P)| is integrable over S, and then

$\left|\int_S f(P)\,dP\right| \le \int_S |f(P)|\,dP$ (4.3.6)

with equality iff there exists a constant c of unit modulus such that f = c|f|
almost everywhere on S.


Proof. Since f = g + ih, $|f| = \sqrt{g^2 + h^2} \le |g| + |h|$. If f is integrable over S,
then |g| and |h| are integrable over S. Consequently, |g| + |h| is integrable over S
by Corollary 3 to Theorem 4.3.6. On the other hand, as the non-negative square
root of a non-negative measurable function, |f| is measurable. (Cf. Exercise 4.2.7.)
From |f| ≤ |g| + |h| and from Theorem 4.3.4 it follows that |f| is integrable over S.

Conversely, suppose that |f| is integrable over S. From $|g| \le \sqrt{g^2 + h^2} = |f|$,
$|h| \le \sqrt{g^2 + h^2} = |f|$, and from Theorem 4.3.4 it follows that g and h are
integrable over S. This implies f is integrable over S.

To prove (4.3.6), suppose $\int_S f(P)\,dP = re^{i\theta}$ with r, θ real; consider the
function $e^{-i\theta}f(P)$. Since $|f(P)| = |e^{-i\theta}f(P)|$ is integrable over S, so is
$e^{-i\theta}f(P)$. Let G(P) and H(P) be the real part and imaginary part of
$e^{-i\theta}f(P)$ respectively. Then

$e^{-i\theta}f(P) = (\cos\theta - i\sin\theta)[g(P) + ih(P)]$
$= g(P)\cos\theta + h(P)\sin\theta + i[h(P)\cos\theta - g(P)\sin\theta]$
$= G(P) + iH(P).$ (4.3.7)

This implies

$\int_S e^{-i\theta}f(P)\,dP = \int_S [g(P)\cos\theta + h(P)\sin\theta]\,dP + i\int_S [h(P)\cos\theta - g(P)\sin\theta]\,dP.$

By the linearity of the integral (cf. Theorem 4.3.9),

$\int_S e^{-i\theta}f(P)\,dP = \cos\theta \int_S [g(P) + ih(P)]\,dP - i\sin\theta \int_S [g(P) + ih(P)]\,dP$
$= (\cos\theta - i\sin\theta)\,re^{i\theta} = r.$

On the other hand,

$\int_S e^{-i\theta}f(P)\,dP = \int_S G(P)\,dP + i\int_S H(P)\,dP.$

This implies that $r = \int_S G(P)\,dP$.
Since $|f(P)| = |e^{-i\theta}f(P)| = \sqrt{G^2(P) + H^2(P)} \ge |G(P)|$,

$\int_S |f(P)|\,dP = \int_S |e^{-i\theta}f(P)|\,dP = \int_S \sqrt{G^2(P) + H^2(P)}\,dP$
$\ge \int_S |G(P)|\,dP$
$\ge \left|\int_S G(P)\,dP\right| = |r| = \left|\int_S f(P)\,dP\right|.$ (4.3.8)

We have established (4.3.6).
To prove the statement on equality, first we assume f = c|f| for some complex
number c = a + ib with |c| = 1. Then

$\int_S f(P)\,dP = \int_S a|f|(P)\,dP + i\int_S b|f|(P)\,dP$
$= (a + ib)\int_S |f|(P)\,dP$
$= c\int_S |f|(P)\,dP.$

Thus

$\left|\int_S f(P)\,dP\right| = |c|\int_S |f|(P)\,dP = \int_S |f|(P)\,dP.$


Conversely, let us assume $\left|\int_S f(P)\,dP\right| = \int_S |f(P)|\,dP$. From (4.3.8)
it follows that

$\int_S |G(P)|\,dP = \left|\int_S G(P)\,dP\right|$ (4.3.9)

and

$\int_S \sqrt{G^2(P) + H^2(P)}\,dP - \int_S |G(P)|\,dP$
$= \int_S \left[\sqrt{G^2(P) + H^2(P)} - \sqrt{G^2(P)}\right] dP = 0.$ (4.3.10)

By Theorem 4.3.7, (4.3.9) yields G(P) ≥ 0 a.e. on S or G(P) ≤ 0 a.e. on S. By
Theorem 4.3.8, (4.3.10) yields H(P) = 0 a.e. on S and
$\sqrt{G^2(P) + H^2(P)} = \sqrt{G^2(P)}$ a.e. on S.
If G(P) ≥ 0 a.e. on S, then

$|f(P)| = \sqrt{G^2(P) + H^2(P)} = G(P)$ a.e. on S. (4.3.11)

From (4.3.7) and from H(P) = 0 a.e. on S, it follows that

$H(P) = h(P)\cos\theta - g(P)\sin\theta = 0$ a.e. on S

and

$G(P) = g(P)\cos\theta + h(P)\sin\theta$
$= (\cos\theta - i\sin\theta)[g(P) + ih(P)]$ a.e. on S
$= e^{-i\theta}f(P).$ (4.3.12)

Combine (4.3.11) and (4.3.12) to infer

$e^{i\theta}|f(P)| = f(P).$

The case of G(P) ≤ 0 a.e. on S can be handled similarly and equally easily. ∎
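
Inequality (4.3.6) and its equality case can be observed numerically on a complex-valued simple function. In the Python sketch below the three values, the measures of the pieces, and the angle 0.7 defining the unimodular constant c are assumptions picked for illustration; the integral of a simple function is taken to be the sum of value times measure.

```python
import cmath

# Hypothetical complex simple function: (value, measure of piece).
pieces = [(1 + 2j, 0.5), (-3 + 0.5j, 1.0), (2 - 1j, 0.25)]

integral = sum(v * m for v, m in pieces)           # integral of f
integral_abs = sum(abs(v) * m for v, m in pieces)  # integral of |f|
print(abs(integral), integral_abs)

# Equality case: replace f by c|f| with |c| = 1.
c = cmath.exp(0.7j)
aligned = [(c * abs(v), m) for v, m in pieces]
aligned_integral = sum(v * m for v, m in aligned)
print(abs(aligned_integral), integral_abs)
```

For the first function the values point in different directions and the inequality is strict; for the aligned function the two sides agree up to rounding.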


Applying the notion of almost everywhere, which was first introduced in 
Section 4.1, we may define an equivalence relation on the set of all extended real- 
valued measurable functions defined on the measurable set S. Two such functions 
f(P) and g(P) are said to be equivalent, denoted by f ~ g, iff f and g are equal 
$\mu_m$-almost everywhere on S, i.e. $\mu_m\{P;\ f(P) \ne g(P)\} = 0$. It can be verified
easily that (i) f ~ f, (ii) f ~ g iff g ~ f, (iii) f ~ g and g ~ h implies f ~ h. We
note that f ~ g implies $\int_S f(P)\,dP = \int_S g(P)\,dP$ if either side exists. This
result is important in the next section.

The last range of ideas to be discussed in this section centers around the 
theorem of Fubini (1907), which will be stated without a proof. Suppose that f(P) 
is a non-negative function integrable over the measurable set S. Without loss of
generality, we may extend the definition of f(P) by setting f(P) = 0 for
$P \in R^m \ominus S$. The extended function is clearly measurable and its least set of
ordinates has the same measure so the integral is unchanged. Now
$\int_{R^m} f(P)\,dP$ has a meaning and the question arises: Does the repeated integral

$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, x_2, \ldots, x_m)\,dx_1\,dx_2 \cdots dx_m$

have a meaning, preferably one which is independent of the order of the integration
and which agrees with $\int_{R^m} f(P)\,dP$? An answer to this question is given by the
theorem of Guido Fubini (1879-1943), which we formulate as follows.


Theorem 4.3.11. If f(P) is non-negative integrable over $R^m$, then there exists
at least one function g(P) with 0 ≤ g(P) ≤ f(P) and g ~ f, such that all repeated
integrals

$\int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(x_1, x_2, \ldots, x_m)\,dx_1\,dx_2 \cdots dx_m$

for any order of integration, exist and are equal, the common value being that of
$\int_{R^m} f(P)\,dP$. Conversely, the existence of such a function g(P), for which one of
the repeated integrals exists and has a finite value, is sufficient for the integrability
of f(P) over $R^m$, provided f(P) is measurable.


EXERCISE 4.3 


1. Let f(P) be a non-negative function defined on a set B in $R^m$. Show that the outer
measure of $\Omega_0(f;B)$ is equal to that of $\Omega_1(f;B)$, i.e.
$\mu^*[\Omega_0(f;B)] = \mu^*[\Omega_1(f;B)]$.


2. Let $\{f_n(P)\}$ be a monotone increasing sequence of non-negative functions defined
on a set B in $R^m$. Let $f(P) = \lim_{n\to\infty} f_n(P)$ for each P in B. Show that

$\Omega_0(f;B) = \bigcup_{n=1}^{\infty} \Omega_0(f_n;B).$


3. For an arbitrary subset S of $R^{m+1}$, define the Lebesgue inner measure $\mu_*(S)$ of S by
taking $\mu_*(S) = \sup\{\mu_{m+1}(E) \mid E \text{ is measurable and } E \subset S\}$. Prove that

(a) If S is measurable, then $\mu^*(S) = \mu_*(S)$.
(b) If $\mu_*(S) = \mu^*(S) < \infty$, then S is measurable.


4. Show that the sum, scalar multiple, and product of simple functions are simple 
functions. 


5. Show that the Monotone Convergence Theorem need not hold for a decreasing 
sequence of non-negative measurable functions. 
6. Use the Monotone Convergence Theorem to prove Fatou's Lemma, which is very
useful for it enables us to handle sequences of functions that are not monotone.
Fatou's Lemma: Let $\{f_n\}$ be a sequence of extended real-valued non-negative
measurable functions (with S as the same domain of definition), then

$\int_S \left(\liminf_{n\to\infty} f_n\right) dP \le \liminf_{n\to\infty} \int_S f_n\,dP.$
5 5 


7. Use Fatou's Lemma and Theorem 4.3.10 to establish the following Lebesgue
Dominated Convergence Theorem: Let $\{f_n\}$ be a sequence of functions integrable
over S which converges almost everywhere on S to a real-valued measurable
function f. If there exists an integrable (over S) function g such that
$|f_n| \le g$ for all n, then f is integrable over S and
$\int_S f(P)\,dP = \lim_{n\to\infty} \int_S f_n(P)\,dP$.
[Hint: $|f_n| \le g$ implies $g + f_n \ge 0$ and $g - f_n \ge 0$.]


8. If an extended real-valued function f(P) is integrable over S, show that f is finite
almost everywhere in S, that is, the subsets $f^{-1}(\infty)$ and $f^{-1}(-\infty)$ are of measure 0.


9. Suppose that for each n, $f_n$ is an extended real-valued function integrable over S.
Suppose further $\sum_{n=1}^{\infty} \int_S |f_n(P)|\,dP < \infty$. Prove that the series
$\sum_{n=1}^{\infty} f_n(P)$ converges almost everywhere (on S) to a real-valued
integrable function f. Moreover,

$\int_S f(P)\,dP = \sum_{n=1}^{\infty} \int_S f_n(P)\,dP.$

After this discussion of Lebesgue measure and integration we proceed to a survey
of the linear vector spaces formed by equivalence classes of functions integrable
over a measurable set S in $R^m$. Usually the set S plays only a minor role and may
be taken as all of $R^m$.

We consider vector spaces $L_p(S)$, where 1 ≤ p ≤ ∞. For p finite the elements
of one of these spaces are the equivalence classes of functions measurable over S
such that

$\|f\|_p = \left[\int_S |f(P)|^p\,dP\right]^{1/p}$ (4.4.1)

exists as a finite number. The pth root is extracted to obtain homogeneity since it
is planned to use this expression as the norm of f in $L_p$. The case p = ∞ is deferred
for the present.

We start with p = 1. Any function P → f(P) which is measurable and
integrable over S defines a class of equivalent functions. We recall that g is
equivalent to f, if f(P) = g(P) for almost all P, i.e. except for a set of
m-dimensional measure zero. The equivalence classes are the elements of $L_1(S)$,
usually written L(S). Each equivalence class is represented by any one of its
elements and the representative need be defined only almost everywhere.

Operations on equivalence classes are defined by operations on representative
elements. If f and g define two equivalence classes {f} and {g}, then f + g defines
an equivalence class {f + g} which, by definition, is the sum of the equivalence
classes {f} and {g}. Similarly, we set α{f} = {αf}. Since f + g and αf are
measurable and, by Theorem 4.3.9, integrable over S, the set L(S) is a linear vector
space under this definition of addition and scalar multiplication.

We introduce a norm in L(S) by (4.4.1) with p = 1. It is to be thought of as the 
norm of the equivalence class, though usually we omit the braces. It is clear that 
this convention leads to a norm. It is non-negative and zero iff f ~ 0. Properties 
(N,) and (N;) are also obvious. It will be shown later that L(S) is complete under 
this norm. 

Suppose now that p is fixed, 1 < p < ∞. The elements of $L_p(S)$ are the
equivalence classes of functions f measurable over S for which the integral in
(4.4.1) has a finite value. The definitions of addition and scalar multiplication are
unchanged. While αf obviously is in $L_p(S)$, it is not clear that f + g always has this
property. It is required to prove that if f and g generate equivalence classes in
$L_p(S)$, so does f + g. This follows from the analogue of (3.1.12),

$|f(P) + g(P)|^p \le [|f(P)| + |g(P)|]^p \le 2^{p-1}[|f(P)|^p + |g(P)|^p],$ (4.4.2)

provided f and g are both defined and finite. Here the third member is the sum of
two integrable functions and hence integrable. By Theorem 4.3.4 the first member
is also integrable. Hence $L_p(S)$ is a linear vector space under the adopted definition
of addition and multiplication. In this case the triangle inequality does not come
out as a byproduct of linearity. We shall require analogues of the theorems of
Hölder and Minkowski (for $l_p$) in order to prove that (4.4.1) actually defines a
norm. This will be done later.
$L_p(S)$ and $L_q(S)$ are said to be conjugate Lebesgue spaces over S if

$\frac{1}{p} + \frac{1}{q} = 1.$ (4.4.3)

This makes $L_2$ self-conjugate. Note that the relation of conjugation is an
involution: the conjugate of the conjugate space is the original space.

We have still to define $L_\infty(S)$. Its elements shall be the equivalence classes
of measurable "essentially bounded" functions. A function f is essentially bounded
if it is equivalent to a bounded function. Let the essential supremum of f, written
ess sup |f|, be the infimum of the suprema of the absolute values |g(P)| for all g
belonging to the equivalence class determined by f. We set

$\|f\|_\infty = \operatorname{ess\,sup} |f(P)|.$ (4.4.4)

It is easily seen that this is an acceptable norm. The space $L_\infty(S)$ is regarded as
the conjugate space of $L_1(S)$, and vice versa. Note that

$\|f\|_\infty = \inf_{g \in \{f\}} \sup_{P \in S} |g(P)|.$ (4.4.5)

In passing we ask: Could any Lebesgue space be an algebra? Under the usual
definition of multiplication of elements only $L_\infty$ is an algebra. For finite values of p
the product in the ordinary sense of two elements of $L_p(S)$ usually fails to be in
$L_p(S)$. Thus $t \to t^{-c}$ is in $L_p(0, 1)$ if cp < 1, while its square is in $L_p$ iff 2cp < 1.
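
The power-function example can be made concrete with the elementary antiderivative of $t^{-a}$. The Python sketch below evaluates the integral of $t^{-a}$ over [ε, 1] in closed form and lets ε shrink; the values c = 0.4 and p = 2 (so cp = 0.8 and 2cp = 1.6) and the helper name `truncated_integral` are assumptions chosen for illustration.

```python
import math

def truncated_integral(a, eps):
    """Exact value of the integral of t**(-a) over [eps, 1] for 0 < eps < 1."""
    if a == 1.0:
        return -math.log(eps)
    return (1.0 - eps ** (1.0 - a)) / (1.0 - a)

c, p = 0.4, 2.0
for eps in (1e-2, 1e-4, 1e-8):
    in_Lp = truncated_integral(c * p, eps)       # exponent cp = 0.8 < 1
    square = truncated_integral(2 * c * p, eps)  # exponent 2cp = 1.6 > 1
    print(eps, in_Lp, square)
```

The first column of integrals stays below 1/(1 - cp) = 5 as ε decreases, while the second grows without bound, which is the behaviour described for the function and its square.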

In $L_1(R^m)$ there is, however, an alternate mode of defining a product in terms
of which the space becomes an algebra. The important operation of convolution
leads to such a product. We shall sketch the argument which is based on the Fubini
theorem. Take two elements f and g in $L(R^m)$ and form

$(f * g)(P) = \int_{R^m} f(Q)\,g(P - Q)\,dQ.$ (4.4.6)

Here

$f(Q)\,g(P - Q) = f(s_1, s_2, \ldots, s_m)\,g(t_1 - s_1, t_2 - s_2, \ldots, t_m - s_m)$

and the integration is carried out with respect to the s-variables over $R^m$. Here it
may be questioned if the integral exists at least for almost all P, and if so, whether
the resulting function is an element of $L(R^m)$. The answer is in the affirmative on
both counts and this follows from Theorem 4.3.11. This shows that the integral of
f(Q) g(P - Q) over $R^m \times R^m$ exists at least if f and g are replaced by suitable
equivalent functions. Moreover,

$\int_{R^m} \int_{R^m} |f(Q)|\,|g(P - Q)|\,dP\,dQ = \int_{R^m} |f(Q)|\,dQ \cdot \int_{R^m} |g(P)|\,dP,$

so that the star product is well defined and is an element of $L(R^m)$. Further,

$\|f * g\|_1 \le \|f\|_1\,\|g\|_1$ (4.4.7)

so that $L(R^m)$ forms a normed algebra under convolution. It is a commutative
algebra for the product does not depend upon the order of the factors. This is
shown by a change of variables in (4.4.6).
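
A discrete analogue may help to fix ideas about (4.4.6) and (4.4.7). The Python sketch below replaces Lebesgue measure on $R^m$ by counting measure on the integers, so that integrals become finite sums; this substitution, the two sample sequences, and the helper names `convolve` and `norm1` are assumptions made for illustration and are not the construction of the text.

```python
def convolve(f, g):
    """Discrete convolution of two finitely supported sequences.

    f and g are lists indexed from 0; the result has len(f) + len(g) - 1 terms
    and (f * g)[n] = sum over k of f[k] * g[n - k].
    """
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def norm1(seq):
    """Sum of absolute values, the counting-measure analogue of the L_1 norm."""
    return sum(abs(x) for x in seq)

f = [1.0, -2.0, 0.5]
g = [0.25, 3.0, -1.0, 2.0]
fg = convolve(f, g)
gf = convolve(g, f)
print(norm1(fg), norm1(f) * norm1(g))
```

In this setting the norm of the product does not exceed the product of the norms, and `fg` and `gf` coincide, mirroring the commutativity obtained from a change of variables.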

To prove that the proposed norm (4.4.1) has the triangle property we proceed
as in Section 3.1 and start with Hölder's inequality for integrals.


Theorem 4.4.1. If f and g belong to conjugate Lebesgue spaces over S, say
$f \in L_p(S)$, $g \in L_q(S)$, 1 < p < ∞, then $fg \in L_1(S)$ and

$\int_S |f(P)|\,|g(P)|\,dP \le \left[\int_S |f(P)|^p\,dP\right]^{1/p} \left[\int_S |g(P)|^q\,dP\right]^{1/q}$ (4.4.8)

with equality iff there exists a fixed constant k > 0 such that for almost all P we
have

$|f(P)|^p = k|g(P)|^q.$ (4.4.9)

Proof. We may assume that the integrals in the right member of (4.4.8) are ≠ 0
since otherwise f or g or both are equivalent to zero and there is nothing to prove.
Formula (3.1.16) gives

$|f(P)|\,|g(P)| \le \frac{1}{p}|f(P)|^p + \frac{1}{q}|g(P)|^q$

for almost all P and hence by Theorem 4.3.4

$\int_S |f(P)|\,|g(P)|\,dP \le \frac{1}{p}\int_S |f(P)|^p\,dP + \frac{1}{q}\int_S |g(P)|^q\,dP.$ (4.4.10)

This shows that $fg \in L_1$ and the inequality is the analogue of (3.1.17). Similarly,
the analogue of (3.1.18) is

$\int_S |f(P)|\,|g(P)|\,dP \le \frac{1}{p} w^{-p} \int_S |f(P)|^p\,dP + \frac{1}{q} w^{q} \int_S |g(P)|^q\,dP.$ (4.4.11)

This function of w has a unique minimum attained for

$w = \left[\int_S |f(P)|^p\,dP\right]^{1/(p+q)} \left[\int_S |g(P)|^q\,dP\right]^{-1/(p+q)}.$ (4.4.12)

Substitution of this value of w in (4.4.11) gives (4.4.8). In order to get equality
there, we must have equality in (4.4.11) for some value of w and this requires that
(4.4.9) holds for some fixed k. ∎
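
The role of the parameter w can be checked numerically. The Python sketch below works with simple functions on a common partition and assumes that the bound (4.4.11) has the form $\frac{1}{p}w^{-p}A + \frac{1}{q}w^{q}B$, where A and B denote the integrals of $|f|^p$ and $|g|^q$; that form, the sample values, the exponent p = 3, and the helper name `bound` are assumptions made for illustration.

```python
# Hypothetical simple functions on a common partition: (|f|, |g|, measure).
pieces = [(1.0, 3.0, 0.5), (2.0, 0.5, 1.0), (0.25, 4.0, 2.0)]
p = 3.0
q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1

lhs = sum(f * g * m for f, g, m in pieces)  # integral of |f||g|
A = sum(f**p * m for f, g, m in pieces)     # integral of |f|^p
B = sum(g**q * m for f, g, m in pieces)     # integral of |g|^q

def bound(w):
    """Assumed right member of (4.4.11) for a given w > 0."""
    return w ** (-p) * A / p + w**q * B / q

w_star = A ** (1.0 / (p + q)) * B ** (-1.0 / (p + q))  # formula (4.4.12)
holder = A ** (1.0 / p) * B ** (1.0 / q)               # right member of (4.4.8)

print(lhs, holder, bound(w_star), bound(1.0))
```

Since p + q = pq for conjugate exponents, the minimum value `bound(w_star)` agrees with the Hölder bound, while any other w gives a larger right member.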


The special case p = 2 is known as the Bounyakovsky-Schwarz inequality
after Viktor Jakovlevič Bounyakovsky (1804-89) and Hermann Amandus Schwarz
(1843-1921), the former professor at the University of St. Petersburg (now
Leningrad), the latter professor at the University of Berlin (now the Humboldt
Universität) where he had been a pupil of Karl Weierstrass and became his
successor.

Minkowski's inequality for integrals follows, again in the same manner as in $l_p$.


Theorem 4.4.2. If f and g belong to $L_p(S)$, so does f + g and

$\left[\int_S |f(P) + g(P)|^p\,dP\right]^{1/p} \le \left[\int_S |f(P)|^p\,dP\right]^{1/p} + \left[\int_S |g(P)|^p\,dP\right]^{1/p}.$ (4.4.13)


Proof. It is only the inequality that must be proved; the integrals are already
known to exist. We have

$[|f(P)| + |g(P)|]^p = |f(P)|\,[|f(P)| + |g(P)|]^{p-1} + |g(P)|\,[|f(P)| + |g(P)|]^{p-1}.$

Next

$\int_S |f(P)|\,[|f(P)| + |g(P)|]^{p-1}\,dP \le \left[\int_S |f(P)|^p\,dP\right]^{1/p} \left[\int_S [|f(P)| + |g(P)|]^p\,dP\right]^{1/q}.$

Here we interchange f and g, add the results and simplify to obtain (4.4.13). ∎

Thus it has been shown that (4.4.1) defines a norm in $L_p(S)$.
There are various consequences of (4.4.8) where until further notice S is
assumed to be bounded so that $\mu_m(S)$ is finite. Then for all p > 1

$L_p(S) \subset L_1(S)$ (4.4.14)

and it is fairly easy to see that the inclusion is proper. To prove (4.4.14), observe
that the function 1(P) which is identically one in S belongs to $L_q(S)$ for all q where
p and q satisfy (4.4.3). Since the q-norm of 1(P) is $[\mu_m(S)]^{1/q}$, it is seen that

$\|f\|_1 \le [\mu_m(S)]^{1/q}\,\|f\|_p.$ (4.4.15)


We can rewrite this and similar inequalities in a more elegant form by
introducing the pth mean of f over S,

$M_p[f] = \left\{[\mu_m(S)]^{-1} \int_S |f(P)|^p\,dP\right\}^{1/p}.$ (4.4.16)

We then get

$M_1[f] \le M_p[f]$ (4.4.17)

with equality iff f is a constant. More generally

$M_\alpha[f] \le M_\beta[f], \quad \alpha < \beta,$ (4.4.18)

again with equality iff f is (equivalent to) a constant.
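
The monotonicity of the means in the exponent is easy to observe on simple functions. In the Python sketch below the two sample functions, the list of exponents, and the helper name `mean` are assumptions chosen for illustration; `mean` evaluates the mean of order p for a simple function given by its values and the measures of the corresponding pieces.

```python
def mean(pieces, p):
    """p-th mean M_p[f] of a simple function given as (value, measure) pairs."""
    total_measure = sum(m for _, m in pieces)
    integral = sum(abs(v) ** p * m for v, m in pieces)
    return (integral / total_measure) ** (1.0 / p)

pieces = [(1.0, 0.5), (-2.0, 1.0), (0.25, 2.0)]  # hypothetical non-constant f
exponents = [1.0, 1.5, 2.0, 3.0, 6.0]
means = [mean(pieces, p) for p in exponents]
print(means)

constant = [(3.0, 0.5), (3.0, 2.0)]              # f identically 3
print([mean(constant, p) for p in exponents])
```

For the non-constant function the means increase strictly with the exponent, while for the constant function every mean equals the constant, in line with the equality statement.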

Once more, let y,,(S) be finite and let S, be a measurable subset of δ. Consider 
the characteristic function of S,. This is the function which equals one in the set S, 
and is zero elsewhere. It is clearly an element of L,(S) for any q >1 and (4.4.8) 
gives 


| (PAP « [Ss] I lp (4.4.19) 


where the norm is with respect to $L_p(S)$. This inequality has an interesting
implication. The left member of (4.4.19) is a set function defined on any measurable subset
of $S$ and for any element of $L_1(S)$. All we can say in general is that this
non-negative function of sets goes to zero with the measure of the set. On the subspace
$L_p(S)$ the sharper estimate (4.4.19) holds and this is the best possible in the sense
that the exponent $1/q$ can be replaced by no larger number. See Problem 6,
Exercise 4.4.

We have introduced a metric in each of the spaces $L_p$. It remains to prove that
the space is complete in its metric. We no longer insist that $\mu_m(S)$ be finite.


Theorem 4.4.3. The space $L_1(S)$ is complete with respect to the metric defined
by
$$d(f, g) = \|f - g\|_1. \tag{4.4.20}$$

Proof. Suppose that a Cauchy sequence is given. There exists then a subsequence
$\{f_n\}$ such that


$$\|f_0\|_1 + \sum_{n=0}^{\infty} \|f_{n+1} - f_n\|_1 \equiv A < \infty. \tag{4.4.21}$$


Consider the infinite series


$$|f_0(P)| + \sum_{n=0}^{\infty} |f_{n+1}(P) - f_n(P)| \equiv F(P) \tag{4.4.22}$$


with the partial sums
$$F_n(P) = |f_0(P)| + \sum_{k=0}^{n-1} |f_{k+1}(P) - f_k(P)|.$$


The sequence $\{F_n(P)\}$ is non-decreasing and made up of non-negative elements.
Further,


$$\lim_{n \to \infty} \int_S F_n(P)\,dP = A.$$


By Theorem 4.3.5 the series (4.4.22) converges for almost all $P$ and defines an
element of $L_1(S)$.
Consider the series


$$f_0(P) + \sum_{n=0}^{\infty} [f_{n+1}(P) - f_n(P)] \equiv f(P) \tag{4.4.23}$$


obtained by omitting the absolute value signs in (4.4.22). This series is absolutely
convergent for almost all $P$ and for such points


$$|f(P)| \le F(P)$$
so that $f \in L_1(S)$. Moreover,


$$\|f - f_n\|_1 \le \|F - F_n\|_1 \to 0 \quad \text{as } n \to \infty. \tag{4.4.24}$$


What we have proved so far is that the given Cauchy sequence contains a
subsequence which converges almost everywhere to a limit which is in $L_1(S)$. If
the original Cauchy sequence is $\{g_n\}$, it is to be proved that


$$\lim_{n \to \infty} \|f - g_n\|_1 = 0. \tag{4.4.25}$$
Suppose that
$$f_n = g_{k(n)},$$


where $k(n)$ is a strictly increasing sequence of positive integers. We can choose $n$ so
large that for a given $\varepsilon > 0$


$$\|f - f_n\|_1 = \|f - g_{k(n)}\|_1 < \tfrac12 \varepsilon.$$


If now $k(n) < k < k(n+1)$, we may assume that
$$\|g_k - g_{k(n)}\|_1 < \tfrac12 \varepsilon$$
since $\{g_n\}$ is a Cauchy sequence. This gives
$$\|f - g_k\|_1 \le \|f - g_{k(n)}\|_1 + \|g_{k(n)} - g_k\|_1 < \tfrac12 \varepsilon + \tfrac12 \varepsilon = \varepsilon.$$
This proves (4.4.25) and shows that $L_1(S)$ is complete in the normed metric. ∎
A similar result holds for $1 < p < \infty$. We shall only sketch the proof.


Theorem 4.4.4. The space $L_p(S)$ is complete under the norm defined by (4.4.1)
for $1 < p < \infty$.


Proof. We can restrict ourselves to the case where $S$ is bounded. If $\{g_n\}$ is the
given Cauchy sequence, we can find a subsequence $\{f_n\}$ such that


$$\|f_0\|_p + \sum_{n=0}^{\infty} \|f_{n+1} - f_n\|_p = A < \infty. \tag{4.4.26}$$
As above we set
$$|f_0(P)| + \sum_{n=0}^{\infty} |f_{n+1}(P) - f_n(P)| \equiv F(P) \tag{4.4.27}$$


when the series converges. Then by (4.4.19) and (4.4.26)


$$\int_S F(P)\,dP \le [\mu_m(S)]^{1/q} A,$$


whence it follows that the series (4.4.27) converges almost everywhere. Thus $F$ is
measurable and
$$\|F\|_p \le A.$$
Next we set


$$f(P) \equiv f_0(P) + \sum_{n=0}^{\infty} [f_{n+1}(P) - f_n(P)]. \tag{4.4.28}$$


Again this series converges absolutely almost everywhere in $S$ and its sum is
dominated by $F(P)$ which is in $L_p(S)$. It follows that $f(P)$ is also in $L_p(S)$.
Furthermore, the statement that the series converges to $f$ in the $p$th norm becomes


$$\lim_{n \to \infty} \|f - f_n\|_p = 0. \tag{4.4.29}$$


The proof is then completed as in the case $p = 1$. ∎


Theorem 4.4.5. The space $L_\infty(S)$ is complete under the normed metric defined
by (4.4.4).


Proof. As above, we choose from the given Cauchy sequence $\{g_n\}$ a subsequence
$\{f_n\}$ satisfying a convergence condition. In this case it is assumed that


$$\|f_{n+1} - f_n\|_\infty = a_n, \quad \text{with } \sum a_n < \infty. \tag{4.4.30}$$
We then consider the series
$$|f_0(P)| + \sum |f_{n+1}(P) - f_n(P)| \equiv F(P). \tag{4.4.31}$$


The terms of this series are measurable functions of $P$ on $S$ and each term is
equivalent to a bounded function. We proceed to replace the terms of the series by
equivalent bounded functions. Here $f_n(P)$ is replaced by an equivalent function
$\tilde f_n(P)$ subject to the following conditions:


1) $\tilde f_{n+1}(P) - \tilde f_n(P)$ is bounded in $S$ and its supnorm is $\le a_n$.
2) $\tilde f_{n+1}(P) - \tilde f_n(P) = f_{n+1}(P) - f_n(P)$ except in a set $E_n$ of measure zero.


The union of these sets $\bigcup E_n = E$ is also a set of measure zero. We now form


$$\tilde F(P) = |\tilde f_0(P)| + \sum |\tilde f_{n+1}(P) - \tilde f_n(P)| \tag{4.4.32}$$


and note that
$$\tilde F(P) = F(P)$$


outside the set $E$. The series in (4.4.32) converges uniformly in $S$ including $E$.
Without restricting the generality we may assume that $\tilde f_n(P) = 0$ for $P \in E$ and all $n$.
Thus $\tilde F$ is a bounded measurable function and hence defines an equivalence class in
$L_\infty(S)$. Further,


$$\lim_{n \to \infty} \tilde f_n(P) \equiv \tilde f(P)$$


exists uniformly in $S$ and


$$\lim_{n \to \infty} f_n(P) \equiv f(P)$$


exists at least for $P$ not in $E$. Moreover, $|\tilde f(P)| \le |f(P)|$ wherever $f(P)$ is defined. Thus $f(P)$ is equivalent
to a measurable bounded function on $S$, i.e. $f$ defines an equivalence class in
$L_\infty(S)$. We have

$$\operatorname{ess\,sup} |f(P)| \le \operatorname{ess\,sup} F(P) \le \sum a_n.$$
Further,


$$\lim_{n \to \infty} \|f - f_n\|_\infty = 0$$


and an adaptation of the argument used for $p = 1$ shows that this limit relation may
be strengthened to


$$\lim_{n \to \infty} \|f - g_n\|_\infty = 0.$$


We know that simple functions are dense in $L_p(S)$ at least for $p = 1$. An
examination of the proof of Theorem 4.3.6 shows that this assertion holds also for
$1 < p < \infty$. What is more significant, however, is


Theorem 4.4.6. Continuous functions are dense in $L_p(S)$ for $1 \le p < \infty$.
Proof. We start with the case $p = 1$. The assertion is that if $f \in L_1(S)$, then there
exists a sequence of continuous functions $\{f_n\}$ such that


$$\lim_{n \to \infty} \|f - f_n\|_1 = 0. \tag{4.4.33}$$
We can make several preliminary reductions of this problem. We can restrict
ourselves to bounded non-negative functions $f$ vanishing outside a large but bounded
$m$-dim block $B$. The norm can be taken with respect to $L_1(B)$. Next we know that
$f$ can be approximated arbitrarily closely by a simple non-negative function defined
in $B$. Hence it is enough to prove (4.4.33) when $f$ is a simple function. Now every
simple function is of the form


$$g(P) = \sum_j \alpha_j\,\chi(P, S_j), \tag{4.4.34}$$


where the $\alpha$'s are positive numbers (since $g \ge 0$) and $\chi(P, S_j)$ is the characteristic
function of the set $S_j$, the subset of $B$ where $g$ takes the value $\alpha_j$. This means that
the problem is reduced to proving (4.4.33) when $f$ is a characteristic function, say
$f(P) = \chi(P, E)$, where $E$ is a measurable subset of $B$. Here we are back to the
elements of measure theory. From the definition of outer measure we conclude that
we can find a finite partial covering $E^*$ of $E$ with the following properties: (i) $E^*$ is
the union of a finite number of disjoint blocks (they may have boundary points in
common); (ii) $\mu_m(E \ominus E^*) + \mu_m(E^* \ominus E) < \varepsilon$, a preassigned positive number.
It is clear that the $L_1$-norm of


$$\chi(P, E) - \chi(P, E^*)$$


is $< \varepsilon$. Now the characteristic function of $E^*$ is the sum of the characteristic
functions of the constituent blocks since they are disjoint. This means that all we have
to do is to prove (4.4.33) when $f$ is the characteristic function of a block in $B$.

Suppose the block is $B_0$ and let $f(P) = \chi(P; B_0)$, $P \in B$. Let $\varepsilon > 0$ be given and
define a continuous function $f_\varepsilon$ as follows. Let $B_\varepsilon$ be a block concentric with $B_0$ and
with edges parallel to those of $B_0$ obtained from $B_0$ by a similitude with respect to
the center $P_0$ with ratio $(1 + \varepsilon) : 1$. Draw an arbitrary ray from $P_0$ and mark the
points $Q_0$ and $Q_\varepsilon$ where the ray meets the boundaries of $B_0$ and $B_\varepsilon$, respectively.
Let $P$ be any point on this ray and define $f_\varepsilon(P)$ to have the value 1 if $P$ lies between
$P_0$ and $Q_0$, the value 0 for $P$ outside of $B_\varepsilon$, and let $f_\varepsilon$ vary linearly for $P$ between $Q_0$
and $Q_\varepsilon$. Here $P \to f_\varepsilon(P)$ is obviously continuous on each ray from $P = P_0$, and it is
not hard to see that it is not necessary to restrict oneself to a ray and $f_\varepsilon$ is a
continuous function of $P$ in $B$. Further,


$$\|f - f_\varepsilon\|_1 \le \mu_m(B_\varepsilon \ominus B_0),$$


and this is $O(\varepsilon)$ uniformly for all blocks in $B$. Retracing the steps, we see that any
element of $L_1(S)$ can be approximated arbitrarily closely by a continuous function.

If $S$ is bounded, a continuous function $f_\varepsilon$ which approximates $f$ in the $L_1$-metric
will have the same property with respect to the $L_p$-metric. If $S$ is unbounded we
first replace $f$ by $g_k$, where $g_k(P) = f(P)$ or 0 according as the distance of $P$ from the
origin is $\le k$ or $> k$. For large values of $k$ the $L_p$-norm of $f - g_k$ is arbitrarily small
and we can proceed as above. ∎
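
The last step of the construction can be tried out in one dimension, where a block is an interval $[a, b]$ and the continuous approximant is a trapezoid. The sketch below is only an illustration under that assumption; the function names and the quadrature grid are hypothetical. In this case the $L_1$ distance works out to $\tfrac12 \varepsilon (b - a)$, which stays below the measure $\varepsilon (b - a)$ of the set between the two blocks.

```python
def indicator(t, a, b):
    """Characteristic function of the block B_0 = [a, b]."""
    return 1.0 if a <= t <= b else 0.0


def trapezoid(t, a, b, eps):
    """Continuous approximant: 1 on B_0, 0 outside B_eps, linear along each ray."""
    center = 0.5 * (a + b)
    r0 = 0.5 * (b - a)          # half-length of B_0
    r1 = (1.0 + eps) * r0       # half-length of B_eps (similitude ratio 1 + eps)
    d = abs(t - center)
    if d <= r0:
        return 1.0
    if d >= r1:
        return 0.0
    return (r1 - d) / (r1 - r0)


def l1_distance(a, b, eps, lo=-2.0, hi=3.0, n=50000):
    """Midpoint-rule approximation of the L_1 norm of indicator - trapezoid."""
    h = (hi - lo) / n
    return h * sum(
        abs(indicator(lo + (i + 0.5) * h, a, b) - trapezoid(lo + (i + 0.5) * h, a, b, eps))
        for i in range(n)
    )


for eps in (0.5, 0.1, 0.02):
    print(eps, l1_distance(0.0, 1.0, eps))
```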


The elements of $L_p(S)$ have certain generalized continuity properties which
lead to a definition of a Lebesgue modulus of continuity. The definition requires
that $f$ is defined for all $P$, so we assume that $S = R^m$. The reader will find another
alternative in Section 4.5. If $H \in R^m$, we can consider


$$g(H) = \Bigl\{\int |f(P + H) - f(P)|^p\,dP\Bigr\}^{1/p}. \tag{4.4.35}$$


This is a bounded function of $H$ and $0 \le g(H) \le 2\|f\|_p$. For a given positive
number $h$ we define


$$\mu_p(h; f) = \sup g(H), \tag{4.4.36}$$


where $H$ ranges over the sphere $\|H\| \le h$ in $R^m$.


Theorem 4.4.7. The Lebesgue $p$-modulus of continuity of $f$ is a bounded,
non-negative, non-decreasing continuous function of $h$ such that


$$\lim_{h \downarrow 0} \mu_p(h; f) = 0 \tag{4.4.37}$$


and
$$\mu_p(h_1 + h_2; f) \le \mu_p(h_1; f) + \mu_p(h_2; f). \tag{4.4.38}$$


Proof. It is clear that $\mu_p(h; f)$ is well defined and positive for $0 < h$ unless $f$ is a
constant on its compact support. Further,


$$\mu_p(h_1; f) \le \mu_p(h_2; f), \quad \text{if } 0 < h_1 < h_2, \tag{4.4.39}$$


since the set of $H$'s which have to be taken into account in computing the left
member form a proper subset of those relevant for the right member. Since


$$\|f(\cdot + H_1 + H_2) - f(\cdot)\|_p \le \|f(\cdot + H_1 + H_2) - f(\cdot + H_1)\|_p + \|f(\cdot + H_1) - f(\cdot)\|_p$$
$$= \|f(\cdot + H_2) - f(\cdot)\|_p + \|f(\cdot + H_1) - f(\cdot)\|_p,$$

formula (4.4.38) results by taking the appropriate suprema.

It remains to prove the continuity and (4.4.37). The latter relation is trivial if
$f$ is continuous and of compact support. But by the preceding theorem we can
always approximate an $f \in L_p(S)$ in the sense of the norm by a function $f_\varepsilon$ which is
continuous and of compact support and this implies (4.4.37) for all $f$.

We have now by (4.4.38) for any $h$ and $t$, $0 < h < t$,


$$0 \le \mu_p(t; f) - \mu_p(t - h; f) \le \mu_p(h; f) \downarrow 0,$$
$$0 \le \mu_p(t + h; f) - \mu_p(t; f) \le \mu_p(h; f) \downarrow 0,$$


as $h \downarrow 0$. Since $\mu_p(t; f)$ is non-decreasing, these inequalities imply continuity. See
Section 3.2 for the ordinary modulus of continuity, formula (3.2.11). ∎
Our next topic is linear functionals on $L_p(S)$. The discussion is attached to
formula (4.4.8). It says that if $f \in L_p(S)$, $g \in L_q(S)$, where $1 < p < \infty$ and $L_p$ and
$L_q$ are conjugate spaces, then


$$\|fg\|_1 \le \|f\|_p\,\|g\|_q. \tag{4.4.40}$$


Hence, for any fixed choice of $g \in L_q$
$$\langle f, g \rangle \equiv \int_S f(P)\,g(P)\,dP \tag{4.4.41}$$


defines a linear bounded functional on $L_p(S)$ with bound $\|g\|_q$ so that


$$|\langle f, g \rangle| \le \|g\|_q\,\|f\|_p. \tag{4.4.42}$$


Here $\|g\|_q$ is the best possible bound, i.e. it gives the norm of the functional. This
follows from the fact that it is possible to choose $f$ in $L_p$ so that


i) $f(P)\,g(P) = |f(P)|\,|g(P)|$, and
ii) there is equality in (4.4.42).


To satisfy (ii) it is necessary and sufficient that $|f(P)|^p$ equals a constant multiple
of $|g(P)|^q$ for almost all $P$. These conditions determine $f$ uniquely up to a constant
multiplier.

Actually all linear bounded functionals on $L_p(S)$ are of this form, that is, if $x^*$
is such a functional, then there exists $g \in L_q$ such that


$$x^*[f] = \int_S f(P)\,g(P)\,dP. \tag{4.4.43}$$


For $p = 2$ this was proved by M. Fréchet in 1907; the general case is due to F. Riesz
in 1910. The formula still makes sense if $f \in L_1$ and $g \in L_\infty$, or vice versa. In the first
case, it was proved by H. Steinhaus in 1918 that every linear bounded functional on
$L_1(S)$ is given by (4.4.43) with a $g \in L_\infty(S)$ and the norm of the functional is $\|g\|_\infty$.
In the second case where $f \in L_\infty(S)$ and $g \in L_1(S)$ the formula defines a linear
bounded functional on $L_\infty$ of norm $\|g\|_1$, but this is not the most general functional
on $L_\infty(S)$. The latter involves a finitely additive set function $E \to h(E)$ defined on
Borel sets of $S$ and is given by a Radon-Stieltjes integral


$$x^*[f] = \int_S f(P)\,dh. \tag{4.4.44}$$


For further details see the literature.
Let us observe that the case $p = 2$ is a special case of Theorem 2.5.11, which says
that any linear bounded functional on a Hilbert space is given by an inner product


$$x^*[f] = (f, h), \tag{4.4.45}$$


where $h$ is a fixed uniquely determined element of the space. Now $L_2(S)$ is a Hilbert
space with inner product


$$(f, h) = \int_S f(P)\,\overline{h(P)}\,dP, \tag{4.4.46}$$


and this is equivalent to (4.4.43) for $p = 2$.
In conclusion a remark concerning linear bounded transformations from
$L_p(S)$ into itself. For $m = 1$ a frequently encountered type is


$$T[f](t) = \int_S K(s, t)\,f(s)\,ds, \tag{4.4.47}$$


where the kernel $K$ satisfies suitable integrability conditions to ensure the
integrability of the transform. These conditions are normally adjusted to the use of
Hölder's inequality and the theorem of Fubini. Thus take $p = 2$ and suppose that
$K(s, t)$ is measurable and square integrable over $S \times S$. Set


$$\Bigl\{\int_S \int_S |K(s, t)|^2\,ds\,dt\Bigr\}^{1/2} \equiv \|K\|_2. \tag{4.4.48}$$


Then (4.4.47) exists for almost all $t$ and defines a mapping from $L_2(S)$ into itself
with norm $\|T\| \le \|K\|_2$. The proof is left to the reader.
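
The bound $\|T\| \le \|K\|_2$ can be probed numerically on $S = (0, 1)$ by replacing every integral with a midpoint sum. The kernel $e^{-|s - t|}$, the test function and the helper name below are assumptions chosen only for illustration. The discrete inequality is the Cauchy (Bounyakovsky-Schwarz) inequality applied row by row, so it should hold up to rounding.

```python
import math


def hilbert_schmidt_check(kernel, func, n=200):
    """Return (||Tf||_2, ||K||_2, ||f||_2) computed with midpoint sums on (0, 1)."""
    h = 1.0 / n
    nodes = [(i + 0.5) * h for i in range(n)]
    values = [func(s) for s in nodes]
    # (Tf)(t) = integral of K(s, t) f(s) ds, approximated in the variable s.
    transform = [h * sum(kernel(s, t) * v for s, v in zip(nodes, values)) for t in nodes]
    norm_f = math.sqrt(h * sum(v * v for v in values))
    norm_tf = math.sqrt(h * sum(w * w for w in transform))
    norm_k = math.sqrt(h * h * sum(kernel(s, t) ** 2 for s in nodes for t in nodes))
    return norm_tf, norm_k, norm_f


kernel = lambda s, t: math.exp(-abs(s - t))   # hypothetical square integrable kernel
func = lambda s: math.cos(7.0 * s) + s        # hypothetical element of L_2(0, 1)

norm_tf, norm_k, norm_f = hilbert_schmidt_check(kernel, func)
print(f"||Tf||_2 = {norm_tf:.6f}, ||K||_2 ||f||_2 = {norm_k * norm_f:.6f}")
```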


EXERCISE 4.4 


1. A function $t \to f(t)$ is defined on $[0, 1]$ and takes on the three values 0, 1, 2, namely 0
if $t$ is rational, 1 if $t$ is transcendental, and 2 if $t$ is algebraic but not rational. What
is the $L_\infty$-norm?

2. Verify that convolution is a commutative product.

3. Verify (4.4.7).

4. Verify that the inclusion (4.4.14) is proper.

5. Prove (4.4.18) and discuss when equality holds.

6. Given the function $t \to f(t) = t^{-1/p}\,[\log(4/t)]^{-2}$ with $1 < p < \infty$. Consider

$$\max \int_E f(t)\,dt,$$

where $E$ is any interval of length $h$, $0 < h < 1$, located in the interval $0 < t < 1$.
Show that the maximum lies between constant multiples of $h^{1/q}\,[\log(1/h)]^{-2}$ for small
values of $h$. What bearing does this result have on (4.4.19)?

7. Complete the proof of Theorem 4.4.4.

8. The series (4.4.32) converges uniformly on $S$. Does this imply continuity of its sum?

9. Complete the proof of Theorem 4.4.5.

10. Fill in missing details in the proof of Theorem 4.4.6.

11. Let $f \in L_1(-\infty, \infty)$ and define for $h > 0$

$$f(t; h) = (2h)^{-1} \int_{-h}^{h} f(t + s)\,ds.$$

Show that this "moving average" is a continuous function of $t$ which belongs to
$L_1(-\infty, \infty)$ and its norm is $\le \|f\|_1$. Actually $f(t; h)$ converges to $f(t)$ as $h \downarrow 0$,
pointwise almost everywhere and in the $L_1$-norm.

12. Prove the last clause of the preceding problem under the added assumption that $f$
itself is uniformly continuous for $-\infty < t < \infty$.

13. Fill in missing details in the proof of Theorem 4.4.7.

14. Show that (4.4.41) defines a linear bounded functional on $L_1(S)$ if $f \in L_1(S)$ and $g$ is
fixed in $L_\infty(S)$. Find its norm.

15. Show that the same formula defines a linear bounded functional on $L_\infty(S)$ if $f \in L_\infty$
and $g$ is fixed in $L_1$. Find its norm.

16. Let $\chi_n$ be the characteristic function of the interval $[n, n + 1)$, $n = 0, \pm 1, \pm 2, \ldots$.
Let $x^*$ be a linear bounded functional on $L_p(-\infty, \infty)$ and suppose that
$x^*[\chi_n] = a_n$. Prove that $\sum |a_n|^q$ is convergent and

$$g(t) = \sum_{n=-\infty}^{\infty} a_n\,\chi_n(t) \in L_q(-\infty, \infty).$$

Prove that

$$\int_{-\infty}^{\infty} f(t)\,g(t)\,dt$$

is a linear bounded functional on $L_p(-\infty, \infty)$ which coincides with $x^*$ on the linear
subspace spanned by the $\chi_n$'s. Here $1 < p < \infty$ and $1/p + 1/q = 1$.

17. With the same notation as in the preceding problem, take $p = 1$ and determine what
conditions $\{a_n\}$ and $g(t)$ must satisfy. Show that a linear functional is defined on
$L_1(-\infty, \infty)$ and find its norm.

18. Carry out the same investigation for $p = \infty$.

19. Take $1 < p < \infty$ and show that the null space of the functional determined by $g(t)$
contains every $f$ such that $\int_k^{k+1} f(t)\,dt = 0$ and $f(t) = 0$ outside of $[k, k + 1]$.
Do the values of $x^*$ on the set $\{\chi_n\}$ determine $x^*$ uniquely?

20. The functions $\chi_n$ form an orthogonal system in $L_2(-\infty, \infty)$. Can it be complete?

21. Show that the $n$th Fourier coefficient of $f$

$$f_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(s)\,e^{-ins}\,ds,$$

$n$ integer or zero, defines a linear bounded functional on any space $L_p(-\pi, \pi)$. Find
a bound for this functional.

22. Take $p = 1$ and define products in $L_1(-\pi, \pi)$ by convolution. How is this done?
Show that the Fourier coefficients are not merely linear functionals but also
multiplicative, i.e. $x^*[f * g] = x^*[f]\,x^*[g]$. A constant multiplier should figure
before the integral defining the star product and the Fubini theorem is required.

23. Verify the statements made in connection with the transformation (4.4.47).

24. Modify the conditions on $K$ if it be required that $T$ is a mapping from $L_p(S)$ into itself.


4.5 REMARKS ON FOURIER SERIES 


With any Lebesgue space $L_2(a, b)$ on the real line are associated orthonormal systems
and corresponding expansions which may represent, in one sense or another, the
elements of the space in question. At this juncture we restrict ourselves to classical
Fourier series


$$f(t) \sim \sum_{n=-\infty}^{\infty} f_n\,e^{int} \tag{4.5.1}$$
with
$$f_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(s)\,e^{-ins}\,ds. \tag{4.5.2}$$


For the moment the sign of equivalence in (4.5.1) indicates merely that the
coefficients $f_n$ are associated with a given function $f$ in $L_1(-\pi, \pi)$ by the formulas
(4.5.2). Any other connection with $f$ remains to be proved.

Here the orthonormal system is


$$\{(2\pi)^{-1/2}\,e^{int}\}. \tag{4.5.3}$$


In this theory it is customary to suppose $f$ continued with period $2\pi$ outside the
interval $(-\pi, \pi)$ and that $f(-\pi) = f(\pi)$. This implies that in (4.5.2) the integral
may be taken over any interval of length $2\pi$. The Fourier coefficients $f_n$ are defined
for $f$ in any space $L_p(-\pi, \pi)$ and form a bounded double sequence. Much more
information will become available in the subsequent discussion.

Theorem 4.4.6 applies to the present situation and shows that continuous
functions are dense in $L_p(-\pi, \pi)$ for any $p$ with $1 \le p < \infty$. Such a continuous
function is supposed to satisfy the condition $f(-\pi) = f(\pi)$ and be continued with
period $2\pi$. The notion of a Lebesgue $p$-modulus of continuity has to be modified
slightly in this case since we cannot integrate a periodic function over the whole
real line. Instead we set


$$\mu_p(h; f) = \sup_{|s| \le h} \Bigl\{\int_{-\pi}^{\pi} |f(t + s) - f(t)|^p\,dt\Bigr\}^{1/p} \tag{4.5.4}$$


and it is an easy matter to show that this function has all the properties stated in
Theorem 4.4.7 for the function, denoted by the same symbol there but defined by
formulas (4.4.35) and (4.4.36).

We can utilize these moduli to get estimates for the Fourier coefficients.
Theorem 4.5.1. If $n \ne 0$, we have for any $p$, $1 \le p < \infty$,


$$|f_n| \le \tfrac12\,(2\pi)^{-1/p}\,\mu_p\Bigl(\frac{\pi}{|n|}; f\Bigr). \tag{4.5.5}$$


Proof. We make a change of variable in (4.5.2), replacing $s$ by $s + \pi/n$. There is
no need of changing the interval of integration. This gives


$$f_n = -\frac{1}{2\pi} \int_{-\pi}^{\pi} f\Bigl(s + \frac{\pi}{n}\Bigr)\,e^{-ins}\,ds,$$
which, added to (4.5.2), gives


$$2f_n = -\frac{1}{2\pi} \int_{-\pi}^{\pi} \Bigl[f\Bigl(s + \frac{\pi}{n}\Bigr) - f(s)\Bigr]\,e^{-ins}\,ds.$$
Hölder's inequality together with the definition of the $p$th modulus gives (4.5.5). ∎


Corollary (Theorem of Riemann-Lebesgue). We have
$$\lim_{|n| \to \infty} f_n = 0. \tag{4.5.6}$$


Proof. This follows from the fact that for any $p$, $\lim_{h \downarrow 0} \mu_p(h; f) = 0$. ∎
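
A small numerical experiment illustrates (4.5.5) and (4.5.6) for $p = 1$. The test function is the characteristic function of $(0, \pi/2)$, continued with period $2\pi$; this choice is an assumption made for the sketch. For it a direct computation (not taken from the text) gives $\mu_1(h; f) = 2h$ when $0 < h \le \pi/2$, so that (4.5.5) reads $|f_n| \le 1/(2|n|)$ for $|n| \ge 2$, while the exact value is $|f_n| = |\sin(n\pi/4)|/(\pi |n|)$.

```python
import cmath
import math


def fourier_coefficient(func, n, samples=2000):
    """Midpoint-rule approximation of (4.5.2)."""
    h = 2.0 * math.pi / samples
    total = 0j
    for j in range(samples):
        s = -math.pi + (j + 0.5) * h
        total += func(s) * cmath.exp(-1j * n * s)
    return total * h / (2.0 * math.pi)


def step(s):
    """Characteristic function of (0, pi/2) on the period (-pi, pi)."""
    return 1.0 if 0.0 < s < math.pi / 2.0 else 0.0


def modulus_bound(n):
    """Right member of (4.5.5) for p = 1, using mu_1(h; f) = 2h (valid for |n| >= 2)."""
    h = math.pi / abs(n)
    return 0.5 * (2.0 * math.pi) ** (-1.0) * (2.0 * h)


for n in (2, 3, 10, 40):
    print(n, abs(fourier_coefficient(step, n)), modulus_bound(n))
```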
In the $L_2$-case the Fourier coefficients have certain extremal properties, as may
be expected from Theorem 2.5.6 when applied to the Hilbert space $L_2(-\pi, \pi)$ and the
orthonormal system (4.5.3).


Theorem 4.5.2. In $L_2(-\pi, \pi)$ the partial sums of the Fourier series (4.5.1)
form a sequence of best approximations to $f$ in the sense that for any $n$ and any
choice of $2n + 1$ numbers $c_{-n}, c_{-n+1}, \ldots, c_0, \ldots, c_{n-1}, c_n$

$$\int_{-\pi}^{\pi} \Bigl|f(s) - \sum_{k=-n}^{n} c_k\,e^{iks}\Bigr|^2 ds \ge \int_{-\pi}^{\pi} |f(s)|^2\,ds - 2\pi \sum_{k=-n}^{n} |f_k|^2 \tag{4.5.7}$$


with equality iff $c_k = f_k$ for all $k$.


Proof. Translate (2.5.14) into $L_2$-language, keeping in mind the normalization
factor in (4.5.3). ∎


Corollary [Bessel's inequality]. We have
$$2\pi \sum_{n=-\infty}^{\infty} |f_n|^2 \le \int_{-\pi}^{\pi} |f(s)|^2\,ds. \tag{4.5.8}$$


Proof. This is (2.5.15) in the $L_2$-case. ∎


This shows, in particular, that the series in the left member of (4.5.8) is
convergent. Actually we have the equality


$$2\pi \sum_{n=-\infty}^{\infty} |f_n|^2 = \int_{-\pi}^{\pi} |f(s)|^2\,ds \tag{4.5.9}$$


for all $f \in L_2(-\pi, \pi)$. This is known as the trigonometric closure relation or
Parseval's identity.

The reader should not be shocked by the word "trigonometric" in this
connection. In classical analysis Fourier series were actually trigonometric series


$$f(t) \sim \tfrac12 a_0 + \sum_{n=1}^{\infty} [a_n \cos nt + b_n \sin nt] \tag{4.5.10}$$
with
$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(s) \cos(ns)\,ds, \qquad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(s) \sin(ns)\,ds. \tag{4.5.11}$$


Using the formula


$$e^{int} = \cos nt + i \sin nt,$$
we can pass from (4.5.1) to (4.5.10), where for $n > 0$
$$f_n = \tfrac12 (a_n - i b_n), \qquad f_{-n} = \tfrac12 (a_n + i b_n). \tag{4.5.12}$$


We come now to the question of proving (4.5.9). Theorem 4.5.2 asserts that the
partial sums of the Fourier series give the best $L_2$-approximation to $f$ by
trigonometric polynomials of a given degree $n$. Now if we can exhibit a sequence of
trigonometric polynomials


$$T_n(t) = \sum_{k=-n}^{n} c_{k,n}\,e^{ikt}$$


for which the left member of (4.5.7) goes to zero as $n \to \infty$, then the right member
must have the same property and (4.5.9) results.

There are a large number of such polynomials available in the literature. The
oldest and simplest example is due to the Hungarian mathematician Leopold Fejér
(1880-1959), who in 1900 showed that the arithmetic means of the partial sums of
the Fourier series of a continuous periodic function converge uniformly to the
function for all $t$. Later it was shown that they also have the property of mean
convergence in $L_2$ and this is what is needed here.

We start with the partial sums


$$S_n(t; f) = \sum_{k=-n}^{n} f_k\,e^{ikt} = \int_{-\pi}^{\pi} D_n(s)\,f(s + t)\,ds. \tag{4.5.13}$$


Here $D_n(s)$ is a trigonometric polynomial of degree $n$. It is known as the Dirichlet
kernel [Peter Gustav Lejeune-Dirichlet (1805-59), German mathematician of
Huguenot descent]. Formulas (4.5.13) and (4.5.2) show that


$$\pi D_n(s) = \tfrac12 + \cos s + \cos 2s + \cdots + \cos ns. \tag{4.5.14}$$
Multiplying both sides by $\sin \tfrac12 s$ and using the trigonometric identity
$$2 \cos a \sin b = \sin(a + b) - \sin(a - b),$$

we obtain, finally,

$$D_n(s) = \frac{1}{2\pi}\,\frac{\sin(n + \tfrac12)s}{\sin \tfrac12 s}. \tag{4.5.15}$$


The Dirichlet kernel is not of constant sign and the integral of its absolute value
over a period is an unbounded function of $n$, in fact


$$\int_{-\pi}^{\pi} |D_n(s)|\,ds = O[\log n]. \tag{4.5.16}$$


It has been shown that the various convergency defects which plague the theory of
Fourier series are ultimately due to (4.5.16). The arithmetic means of the partial
sums have much better properties. We form


$$T_n(t; f) = \frac{1}{n + 1}\,[S_0(t; f) + S_1(t; f) + \cdots + S_n(t; f)]. \tag{4.5.17}$$


This is again a trigonometric polynomial in $t$ of degree $n$, namely
$$T_n(t; f) = \sum_{k=-n}^{n} \Bigl(1 - \frac{|k|}{n + 1}\Bigr) f_k\,e^{ikt} \tag{4.5.18}$$


with the integral representation

$$T_n(t; f) = \int_{-\pi}^{\pi} F_n(s)\,f(s + t)\,ds, \tag{4.5.19}$$
where $F_n(s)$ is the Fejér kernel, which is the arithmetic mean of the Dirichlet kernels.
Using (4.5.15) and the trigonometric identity

$$2 \sin a \sin b = \cos(a - b) - \cos(a + b),$$


we find, after simplification,


$$F_n(s) = \frac{1}{2\pi (n + 1)} \Bigl[\frac{\sin \tfrac12 (n + 1)s}{\sin \tfrac12 s}\Bigr]^2. \tag{4.5.20}$$

The properties of this kernel are much more favorable for our purposes than
those of the Dirichlet kernel. The Fejér kernel is non-negative and this is the crux
of the matter. We list the properties that will be used below:


$$F_n(s) \ge 0, \tag{4.5.21}$$
$$\int_{-\pi}^{\pi} F_n(s)\,ds = 1, \quad \forall n, \tag{4.5.22}$$
$$\lim_{n \to \infty} F_n(s) = 0, \quad 0 < |s| \le \pi. \tag{4.5.23}$$


Here (4.5.21) and (4.5.23) follow from (4.5.20), and in the last formula the limit
holds uniformly in $s$ provided $|s| \ge \delta$, a fixed positive number. See Problem 9,
Exercise 4.5. Formula (4.5.14) shows that


$$\int_{-\pi}^{\pi} D_n(s)\,ds = 1, \quad \forall n, \tag{4.5.24}$$


and this implies (4.5.22).
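
The kernel formulas can be cross-checked numerically. The sketch below evaluates the Dirichlet kernel from (4.5.14), forms the Fejér kernel both as an arithmetic mean and from the closed form (4.5.20), and approximates the integrals in (4.5.16), (4.5.22) and (4.5.24) by a midpoint rule. All function names and the grid size are assumptions made for the illustration.

```python
import math


def dirichlet(n, s):
    """D_n(s) from (4.5.14): pi * D_n(s) = 1/2 + cos s + ... + cos ns."""
    return (0.5 + sum(math.cos(k * s) for k in range(1, n + 1))) / math.pi


def fejer_mean(n, s):
    """F_n(s) as the arithmetic mean of D_0, ..., D_n."""
    return sum(dirichlet(k, s) for k in range(n + 1)) / (n + 1)


def fejer_closed(n, s):
    """F_n(s) from the closed form (4.5.20); the value at s = 0 is the limit."""
    if abs(math.sin(0.5 * s)) < 1e-12:
        return (n + 1) / (2.0 * math.pi)
    ratio = math.sin(0.5 * (n + 1) * s) / math.sin(0.5 * s)
    return ratio * ratio / (2.0 * math.pi * (n + 1))


def integrate(func, a, b, m=2000):
    """Midpoint rule with m equal subintervals."""
    h = (b - a) / m
    return h * sum(func(a + (j + 0.5) * h) for j in range(m))


def l1_dirichlet(n):
    """Approximation of the integral of |D_n| over (-pi, pi), cf. (4.5.16)."""
    return integrate(lambda s: abs(dirichlet(n, s)), -math.pi, math.pi)


for n in (1, 4, 16):
    total = integrate(lambda s: fejer_closed(n, s), -math.pi, math.pi)
    print(n, total, l1_dirichlet(n))
```

The printed integrals of $|D_n|$ grow slowly with $n$, while the integral of $F_n$ stays at 1.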

We shall now prove that $T_n(t; f)$ has the desired convergency properties. From
this will follow (1) that the partial sums $S_n(t; f)$ also converge to $f(t)$ in $L_2$ and
(2) that the closure relation holds.


Theorem 4.5.3. For $f \in L_2(-\pi, \pi)$


$$\lim_{n \to \infty} \int_{-\pi}^{\pi} |f(t) - T_n(t; f)|^2\,dt = 0. \tag{4.5.25}$$


Proof. We have


$$\int_{-\pi}^{\pi} |f(t) - T_n(t; f)|^2\,dt = \int_{-\pi}^{\pi} \Bigl|\int_{-\pi}^{\pi} F_n(s)\,[f(s + t) - f(t)]\,ds\Bigr|^2 dt,$$


where (4.5.22) has been used. Here we apply the Bounyakovsky-Schwarz
inequality to the $s$-integral, using as the two factors in $L_2$


$$[F_n(s)]^{1/2} \quad \text{and} \quad [F_n(s)]^{1/2}\,[f(s + t) - f(t)]$$


so that


$$\Bigl|\int_{-\pi}^{\pi} F_n(s)\,[f(s + t) - f(t)]\,ds\Bigr|^2 \le \int_{-\pi}^{\pi} F_n(s)\,ds \cdot \int_{-\pi}^{\pi} F_n(s)\,|f(s + t) - f(t)|^2\,ds$$
where the first factor is 1 by (4.5.22). Substituting and changing the order of
integration (the Fubini theorem!) we obtain


$$\int_{-\pi}^{\pi} F_n(s) \Bigl\{\int_{-\pi}^{\pi} |f(s + t) - f(t)|^2\,dt\Bigr\} ds \le \int_{-\pi}^{\pi} F_n(s)\,[\mu_2(|s|; f)]^2\,ds \tag{4.5.26}$$


by (4.5.4) with $p = 2$.
Here $\mu_2(|s|; f)$ is a continuous function which tends to zero with $s$ by the
analogue of Theorem 4.4.7. Let


$$\mu_2(|s|; f) \le M, \ \forall s, \quad \text{and} \quad [\mu_2(|s|; f)]^2 < \varepsilon, \ |s| < \delta.$$
It follows that (4.5.26) is dominated by


$$2\varepsilon \int_0^{\delta} F_n(s)\,ds + 2M^2 \int_{\delta}^{\pi} F_n(s)\,ds < \varepsilon + 2M^2 \int_{\delta}^{\pi} F_n(s)\,ds,$$
again by (4.5.22). On the other hand, (4.5.23) shows that the integral goes to 0 as
$n \to \infty$ since the integrand converges uniformly to 0. It follows that the superior
limit of the integral in (4.5.25) does not exceed $\varepsilon$. Since $\varepsilon$ is arbitrary, (4.5.25)
holds. ∎


Corollary 1. The partial sums of the Fourier series of a function $f \in L_2(-\pi, \pi)$
converge to $f$ in the sense of the metric, i.e.


$$\lim_{n \to \infty} \int_{-\pi}^{\pi} \Bigl|f(t) - \sum_{k=-n}^{n} f_k\,e^{ikt}\Bigr|^2 dt = 0. \tag{4.5.27}$$


Proof. This follows from the inequality


$$\int_{-\pi}^{\pi} |f(t) - S_n(t; f)|^2\,dt \le \int_{-\pi}^{\pi} |f(t) - T_n(t; f)|^2\,dt,$$
which is a special case of (4.5.7), together with Theorem 4.5.3. ∎


Corollary 2. Under the same assumptions


$$\int_{-\pi}^{\pi} |f(s)|^2\,ds = 2\pi \sum_{n=-\infty}^{\infty} |f_n|^2. \tag{4.5.28}$$
Proof. For this is equivalent to (4.5.27). ∎
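
As an illustration of (4.5.8) and (4.5.28), take $f(t) = t$ on $(-\pi, \pi)$; this example is an assumption of the sketch, not of the text. Integration by parts gives $f_0 = 0$ and $f_n = i(-1)^n/n$ for $n \ne 0$, while $\int_{-\pi}^{\pi} t^2\,dt = 2\pi^3/3$. The code computes the coefficients by a midpoint rule and watches the truncated sums approach that value from below.

```python
import cmath
import math


def fourier_coefficient(func, n, samples=2000):
    """Midpoint-rule approximation of (4.5.2)."""
    h = 2.0 * math.pi / samples
    total = 0j
    for j in range(samples):
        s = -math.pi + (j + 0.5) * h
        total += func(s) * cmath.exp(-1j * n * s)
    return total * h / (2.0 * math.pi)


f = lambda t: t
energy = 2.0 * math.pi ** 3 / 3.0   # integral of t^2 over (-pi, pi)


def bessel_sum(N):
    """2*pi times the sum of |f_k|^2 over |k| <= N: the left member of (4.5.8), truncated."""
    return 2.0 * math.pi * sum(
        abs(fourier_coefficient(f, k)) ** 2 for k in range(-N, N + 1)
    )


for N in (1, 5, 20):
    print(N, bessel_sum(N), energy)
```

The gap `energy - bessel_sum(N)` equals the mean square error of the partial sum $S_N(t; f)$ and decreases roughly like $4\pi/N$ for this $f$.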


Theorem 4.5.4. The closure relation (4.5.28) is equivalent to either of the
following statements:

1) The only $f \in L_2(-\pi, \pi)$ which is orthogonal to each of the functions $e^{ikt}$ is
equivalent to 0.

2) The linear combinations of the functions $e^{ikt}$ are dense in $L_2(-\pi, \pi)$.


Proof. This is a special case of Theorem 2.5.8. ∎


Property (1) expresses that the orthogonal system $e^{ikt}$ is maximal, while
property (2) is often expressed by saying that the set of elements $e^{ikt}$ is fundamental
in $L_2$. In $L_2$ a system of elements is maximal iff it is fundamental. In other
Lebesgue spaces the corresponding notions are not necessarily equivalent.

The following is an important consequence of (4.5.28).


Theorem 4.5.5. If $f \in L_2(-\pi, \pi)$, $g \in L_2(-\pi, \pi)$, and if their Fourier
coefficients are $f_n$ and $g_n$, respectively, then


$$\int_{-\pi}^{\pi} f(s)\,\overline{g(s)}\,ds = 2\pi \sum_{n=-\infty}^{\infty} f_n\,\overline{g_n}, \tag{4.5.29}$$
the series being absolutely convergent.


Proof. Write the analogue of (4.5.28) for $\alpha f + \beta g$ and equate the coefficients of
$\alpha \bar\beta$ on the two sides of the equation. ∎
We apply this result to the case where $s \to g(s) = \chi(s; I)$, the characteristic
function of the interval $I = (a, b) \subset [-\pi, \pi]$. Here


$$g_0 = \frac{1}{2\pi}(b - a), \qquad g_n = \frac{1}{2\pi i n}\,(e^{-ina} - e^{-inb}), \quad n \ne 0, \tag{4.5.30}$$
so that
$$\int_a^b f(s)\,ds = f_0\,(b - a) + {\sum}' \frac{f_n}{in}\,(e^{inb} - e^{ina}), \tag{4.5.31}$$


where the prime after the summation sign indicates that $n = 0$ is to be omitted.
Here the series is absolutely convergent. Hence we have proved


Theorem 4.5.6. The Fourier series of any function $f$ in $L_2(-\pi, \pi)$ may be
integrated term by term between any two limits $a$ and $b$, $(a, b) \subset [-\pi, \pi]$. The
result is absolutely convergent and the sum equals $\int_a^b f(s)\,ds$.


The same result holds in $L_p$, $1 < p < 2$, and trivially for $2 < p$. For $p = 1$ the
integrated series still converges to the integral but not necessarily absolutely.
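
Formula (4.5.31) can be tested on $f(t) = t$, whose Fourier coefficients are $f_0 = 0$ and $f_n = i(-1)^n/n$ for $n \ne 0$ (obtained by integration by parts; the example and all names below are assumptions of this sketch). The partial sums of the integrated series should approach $\int_a^b s\,ds = \tfrac12 (b^2 - a^2)$, with a tail of order $1/N$.

```python
import cmath


def coefficient(n):
    """Fourier coefficient (4.5.2) of f(t) = t on (-pi, pi)."""
    return 0.0 if n == 0 else 1j * (-1) ** n / n


def integrated_series(a, b, N):
    """Partial sum over |n| <= N of the right member of (4.5.31)."""
    total = coefficient(0) * (b - a)
    for n in range(1, N + 1):
        for m in (n, -n):
            total += coefficient(m) / (1j * m) * (cmath.exp(1j * m * b) - cmath.exp(1j * m * a))
    return total


a, b = -1.0, 2.5                      # any limits inside [-pi, pi]
exact = 0.5 * (b * b - a * a)
approx = integrated_series(a, b, 2000)
print(f"termwise integration: {approx.real:.6f}, exact value: {exact:.6f}")
```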

This result is really astounding since the Fourier series itself may diverge
everywhere. This and other facts proved above show that the association between the
function and its Fourier series is much closer than what is implied by the formulas
(4.5.2) from which we started. The series may diverge or converge to a sum different
from the local value of the function. On the average the series converges to the
function if $f \in L_2$. "On the average" here means in the sense of (4.5.27). We can
evaluate


$$\int_a^b f(s)\,ds \quad \text{and} \quad \int_{-\pi}^{\pi} f(s)\,\overline{g(s)}\,ds$$


by termwise integration of Fourier series regardless of divergence.

Every $f \in L_2$ gives rise to a sequence of Fourier coefficients which belongs to a
space $l_2$ of bilateral infinite sequences. It is convenient to define the norm in the
latter space by


$$\|\{c_n\}\|_2 = \Bigl[2\pi \sum_{n=-\infty}^{\infty} |c_n|^2\Bigr]^{1/2}. \tag{4.5.32}$$


In effect, this means replacing the ordinary Fourier coefficients by the expansion
coefficients with respect to the orthonormal system (4.5.3). We have now a mapping
$T$ from $L_2$ into $l_2$

$$T: f \to \{f_n\} \tag{4.5.33}$$


which is linear. Since


$$\|f\|_2 = \|T[f]\|_2, \tag{4.5.34}$$


the mapping preserves length, i.e. it is an isometry. Moreover, Theorem 4.5.5 may
be interpreted as stating that inner products are preserved. Here by definition the
left member of (4.5.29) is the inner product in $L_2$ while the right member is the
inner product in $l_2$. This also implies that orthogonality is preserved, since if one
side of the equality is zero, so is the other side.

It is clear that $T$ is 1-1 since the zero elements of $L_2$ and of $l_2$ are in
unique correspondence under $T$. It is natural to ask if $T$ is "onto." In other
words: Is every element $\{f_n\} \in l_2$ the image under $T$ of an element $f$ of $L_2$?
The answer is in the affirmative and is known as the Riesz-Fischer theorem, found
in 1907 independently by F. Riesz and the German mathematician Ernst Fischer
(1875-1954).


Theorem 4.5.7. If $\{f_n\}$ is an element of $l_2$, then there exists an element $f$ of $L_2$
whose Fourier coefficients coincide with $\{f_n\}$.


Proof. Cf. the proof of Theorem 3.5.4. Form the polynomials


$$S_n(t) = \sum_{k=-n}^{n} f_k\,e^{ikt}. \tag{4.5.35}$$


They form a Cauchy sequence in $L_2(-\pi, \pi)$. Let $f$ be the limit of this sequence.
Then $f \in L_2$ and for $|k| \le n$


$$\frac{1}{2\pi} \int_{-\pi}^{\pi} S_n(s)\,e^{-iks}\,ds = f_k.$$
As $n \to \infty$ the left member of this relation tends to
$$\frac{1}{2\pi} \int_{-\pi}^{\pi} f(s)\,e^{-iks}\,ds,$$
so the $k$th Fourier coefficient of $f$ is indeed $f_k$. ∎


As the last topic of this section we consider convolution of two elements in $L_2$
defined by


$$(f * g)(t) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t - s)\,g(s)\,ds. \tag{4.5.36}$$


Cf. formula (4.4.6) for similar constructs. The integral exists for almost all $t$ and
the product is symmetric in the two factors.


Theorem 4.5.8. If $f$ and $g \in L_2(-\pi, \pi)$, then $f * g$ is continuous and periodic
with period $2\pi$. The Fourier series


$$(f * g)(t) = \sum_{n=-\infty}^{\infty} f_n\,g_n\,e^{int} \tag{4.5.37}$$


is absolutely convergent for all $t$.
Proof. Here we have
$$\frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-int} \Bigl\{\frac{1}{2\pi} \int_{-\pi}^{\pi} f(t - s)\,g(s)\,ds\Bigr\} dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-inu} f(u)\,du \cdot \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-ins} g(s)\,ds = f_n\,g_n.$$


This gives the Fourier series in (4.5.37). The series is absolutely convergent by
Cauchy's inequality, uniformly with respect to $t$. Hence the sum of the series is a
continuous function of $t$; moreover, the series is the Fourier series of its sum (see
Problem 2, Exercise 4.5). On the other hand, the series is the Fourier series of the
convolution product. Thus the latter is a continuous function of $t$ and the series
converges point-wise to $(f * g)(t)$. It may be shown directly that this function is
continuous. For if $t_1 < t_2$, then


$$4\pi^2\,|(f * g)(t_2) - (f * g)(t_1)|^2 \le \|g\|_2^2 \int_{-\pi}^{\pi} |f(t_2 - s) - f(t_1 - s)|^2\,ds$$


and the integral is dominated by $[\mu_2(t_2 - t_1; f)]^2$, which goes to zero with $t_2 - t_1$.
Thus we see that the transformation $T$ takes a convolution product in $L_2$ into
a product of Fourier coefficients in $l_2$. ∎
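
The multiplicative property behind (4.5.37) has an exact discrete analogue, sketched below. The integrals in (4.5.2) and (4.5.36) are replaced by Riemann sums over $M$ equally spaced points of the period $[0, 2\pi)$, which is permissible since any interval of length $2\pi$ may be used. The two sampled functions (a sawtooth and a step) and all names are assumptions of the sketch. For these sums the coefficient of the discrete convolution equals the product of the discrete coefficients up to rounding.

```python
import cmath
import math

M = 128                                          # sample points on one period
nodes = [2.0 * math.pi * j / M for j in range(M)]
F = [t - math.pi for t in nodes]                 # samples of a sawtooth
G = [1.0 if t < 1.0 else 0.0 for t in nodes]     # samples of a step


def coefficient(samples, n):
    """Riemann-sum version of (4.5.2) over the period [0, 2*pi)."""
    return sum(v * cmath.exp(-1j * n * t) for v, t in zip(samples, nodes)) / M


def convolve(u, v):
    """Riemann-sum version of (4.5.36); indices are reduced modulo M (periodicity)."""
    return [sum(u[(j - k) % M] * v[k] for k in range(M)) / M for j in range(M)]


H = convolve(F, G)
for n in range(-3, 4):
    print(n, coefficient(H, n), coefficient(F, n) * coefficient(G, n))
```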


EXERCISE 4.5 


1. Verify formulas (4.5.11) and (4.5.12). What is the corresponding form of (4.5.9)? 


2. Show that if a series °°, a,e“" is uniformly convergent for all ¢ and if the sum of the 
series is F(t), then the series is the Fourier series of F. 


3. If f and f’ are both in L,(—7, 2), what relations hold between their Fourier 
coefficients? Show that Σ k? | κι|2 converges. 


4. We have ½(π − t) ~ Σ_{k=1}^{∞} (1/k) sin kt for 0 < t < 2π. Find μ_2(h; f) and discuss the
corresponding inequality (4.5.5). How good is it?

5. If f is a periodic function in BV[−π, π] having Fourier coefficients f_n, show that

    2\pi |n| \, |f_n| \le V_{-\pi}^{\pi}[f].

Show that the estimate is the best possible of its kind.

6. Let {n_k} be a strictly increasing sequence of positive integers such that n_k/k → ∞.
A series of the form Σ a_k cos(n_k t) is known as a cosine gap series or lacunary series.
When is this the Fourier series of a function in L_2? Does this throw some light on
how fast the Fourier coefficients have to tend to zero?

7. Prove Fejér's theorem: If f is a continuous function of period 2π, then the sequence
{T_n(t; f)} converges uniformly to f(t).

8. Under the same assumptions, if m ≤ f(t) ≤ M, ∀t, show that m ≤ T_n(t; f) ≤ M,
∀t, ∀n.

9. Prove (4.5.23) by showing that

    F_n(u) < \frac{3n(n + 1)}{1 + (n + 1)^2 |u|^2}, \qquad |u| \le \pi, \quad n > 3.

[Hint: Consider the interval (n + 1)|u| < π separately.]

10. Verify the expressions for D_n(s) and F_n(s).

11. [Lebesgue] If f(t; h) = (2h)^{-1} ∫_{−h}^{h} f(t + s) ds where f ∈ L_2, show that f(t; h) is a
continuous function of t which converges to f(t) in mean square as h ↓ 0.

12. Show that

    f(t; h) = f_0 + {\sum}' f_n \, \frac{\sin nh}{nh} \, e^{int},

where the prime indicates that n = 0 is excluded. Show that the series is absolutely
convergent.

13. In (4.5.31) take a = 0, b = x. Show that the closure relation for this characteristic
function χ(t; I) simplifies to

    \tfrac{1}{4} x^2 - \tfrac{1}{2} \pi |x| + \tfrac{1}{6} \pi^2 = \sum_{n=1}^{\infty} n^{-2} \cos nx, \qquad -\pi < x < \pi.

Verify directly this identity.

14. In the preceding problem, form the integral of the square of the left member and sum
the series Σ n^{-4}. Verify the closure relation.

15. Under what conditions does the equation f * f = g have a solution f ∈ L_2(−π, π)?


5 METRIC SPACES AND
FIXED POINT THEOREMS


The beginning of this chapter is an elaboration and review of notions introduced in 
Chapter 2. This is followed by partial ordering and fixed point theorems. The 
latter form the hard core of the chapter. 

After the early topologists got through with what might be called the taxonomy 
of the discipline, attention was devoted to the nature of topological mapping. A 
basic question is whether or not such a mapping of a space into itself shifts all 
points, or, possibly, leaves one or more points invariant. Both possibilities occur. 
See Exercise 5.4 for examples. The earliest fixed point theorem is that of L. E. J.
Brouwer who proved in 1912 that a continuous map T of the closed unit ball in R^n
into itself has at least one fixed point, i.e. a point x_0 such that Tx_0 = x_0. Brouwer's fixed
point theorem was used by G. D. Birkhoff and O. D. Kellogg in 1922 to prove
existence theorems in the theory of differential equations. At the same time,
S. Banach found a contraction fixed point theorem that rapidly manifested its
importance. Thus in 1930 R. Caccioppoli proved that the Birkhoff-Kellogg theorem
could be derived from the Banach theorem. In the present chapter we shall prove 
some fixed point theorems of importance in analysis. 

There are seven sections in the chapter: Metric spaces; Partial ordering; 
Contraction mappings; Contraction fixed point theorems; Contractive mappings; 
The Volterra fixed point theorem; and Some applications. 


5.1 METRIC SPACES 


We refer back to Section 2.1 for the basic definitions. A metric space is an abstract 
space in which a notion of distance d(x, y) between any two elements x and y is 
defined subject to postulates (D_1) to (D_3). See Definition 2.1.2. In terms of
distance we could define the concepts of open set, ε-neighborhood, cluster point,
closure, closed set, and being dense in the space. Definition 2.1.3 introduced con- 
vergence, convergence to a point, Cauchy sequence, and complete metric space. 
Metric defined by a norm in a linear vector space came in Definition 2.1.4. 

In Chapters 3 and 4 a number of metric spaces were introduced mostly with a 
normed metric, and stress was laid on the completeness of the space. The com- 
pleteness proofs varied with the space, though certain common features will have 
been apparent to the attentive reader. 

We do not have far to go to find a metric space which is not complete. We can
take Q, the space of all rational numbers, with the usual metric

    d(x, y) = |x - y|.

A Cauchy sequence of rational numbers does not necessarily converge to a rational
number, so Q is not complete. On the other hand, we can embed Q as a dense
subspace in the space R^1, where all Cauchy sequences, in particular the sequences
which are in Q, converge to limits. In fact, one way of introducing the real number
system, when the rationals are taken as known, is to consider all Cauchy sequences
of rational numbers. Two such sequences are equivalent, say {x_n} ~ {y_n}, iff

    \lim_{n \to \infty} |x_n - y_n| = 0.

Equivalent Cauchy sequences form equivalence classes. The set of all equivalence
classes defines an abstract space in which a metric may be introduced by setting

    d(\{x_n\}, \{y_n\}) = \lim_{n \to \infty} |x_n - y_n|.

This is a complete metric space isomorphic to R^1.

If a metric space is not complete, it may be completed by embedding it densely 
in the set of equivalence classes formed by the Cauchy sequences in the manner 
exhibited above for the space Q. It will be assumed in the following that the metric 
spaces under consideration are complete. 
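
The incompleteness of Q can be made concrete with a short computation. The Python sketch below is an illustration only; it assumes Newton's iteration for the square root of 2 as the example sequence, and the name `newton_sqrt2` is hypothetical. Every term is an exact rational number, the successive gaps shrink rapidly, and yet no term satisfies x^2 = 2, so the sequence has no limit in Q.

```python
from fractions import Fraction

def newton_sqrt2(steps):
    """Rational Newton iterates x_{n+1} = (x_n + 2/x_n)/2, starting from x_1 = 1."""
    x = Fraction(1)
    seq = [x]
    for _ in range(steps - 1):
        x = (x + 2 / x) / 2
        seq.append(x)
    return seq

seq = newton_sqrt2(6)

# Distances between consecutive terms in the metric d(x, y) = |x - y|.
gaps = [abs(seq[i + 1] - seq[i]) for i in range(len(seq) - 1)]

# How far each term is from solving x^2 = 2 (never exactly zero for a rational x).
errors = [abs(x * x - 2) for x in seq]

for x, e in zip(seq, errors):
    print(x, float(e))
```

The equivalence class of this sequence is the element of the completed space that plays the role of the square root of 2.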

We should like to add some further concepts taken over from real analysis. In
Section 2.1 the concept of nowhere dense was defined for a metric space X. A set S
in X had this property if every open ball in X contains a ball having no points in
common with S. A set is of the first category in the sense of Baire if it is the union of a
finite or countably infinite set of subsets each of which is nowhere dense in X. A set is
of the second category if it is not of the first. In R^1 the set Q is of the first category
and so is the set of real algebraic numbers, while that of the real transcendental
numbers is of the second category.

The classical Bolzano-Weierstrass theorem asserts that any bounded infinite set
in R^1 has at least one cluster point. The latter is an element of R^1, but need not be
an element of the set. The same theorem holds for sets in C^n. It need not hold in a
general metric space. An instance of this was exhibited in Section 3.2, where it was
shown that the powers of t form an infinite bounded set in C[0, 1] without cluster
point; a still simpler instance is furnished by the space l_1. Here the unit vectors
u_k = (δ_{jk}) form a bounded infinite sequence without cluster points since
‖u_k − u_l‖ = 2 for all distinct k and l.
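
The distance computation for the unit vectors can be checked on finite truncations. The sketch below is illustrative only: it assumes that the norm is the sum of the absolute values of the coordinates (the norm under which the stated value 2 arises), and the helper names `unit_vector` and `l1_distance` are hypothetical.

```python
def unit_vector(k, dim):
    """Truncation of the k-th unit vector to its first `dim` coordinates."""
    return [1.0 if j == k else 0.0 for j in range(dim)]

def l1_distance(x, y):
    """Distance induced by the norm ||x|| = sum of |x_j|."""
    return sum(abs(a - b) for a, b in zip(x, y))

dim = 8
distances = {
    l1_distance(unit_vector(k, dim), unit_vector(l, dim))
    for k in range(dim)
    for l in range(dim)
    if k != l
}
print(distances)
```

Since all mutual distances are equal, no subsequence of the unit vectors can be a Cauchy sequence, which is why no cluster point exists.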

The Heine-Borel theorem is also apt to go by the board in the transit from
finite to infinite dimensional space. Let X be a complete metric space and S a set
in X. A system of open sets G_α is a covering of S if each point of S belongs to at least
one set G_α. S is said to have the Borel property if every system of open sets {G_α}
which covers S contains a finite subsystem which also covers S. In R^n every closed
bounded set has the Borel property and this fact is known as the Heine-Borel
theorem. It is also true in C^n. Such a set S is said to be compact. It is conditionally
compact if its closure is compact. It is sequentially compact if every sequence of
points in S contains a subsequence which converges to a point in S. It may be
shown that in a complete metric space a set S is sequentially compact if it is compact.

The theorem of Arzelà, Theorem 3.2.1, is an example of a compactness theorem
holding in C[a, b].


EXERCISE 5.1 


1. The metric in l_1 is defined in terms of the norm (3.1.11). Show that l_1 is a complete
metric space under the normed topology.

2. Consider the set S in l_1 consisting of all vectors such that all but a finite number of
the coordinates are zero. Show that S is dense in the space.

3. Find an infinite subset of l_1 which is nowhere dense in the space.

4. Find two vectors x and y in l_1 of norm 1 such that all points tx + (1 − t)y,
0 < t < 1, are also of norm 1. In other words, show that the unit sphere in l_1
contains (a continuum of) line segments.

5. Let S be a family of vectors in l_1, say v_α = (v_{α,n}), and suppose that there is a fixed
vector x = (x_n) in l_1 such that |v_{α,n}| ≤ |x_n| for all α and n. Show that S is
conditionally compact.

6. Let F = {f_α} be a family of elements in L_2(−π, π). Show that if F is conditionally
compact, then its elements are uniformly bounded, i.e. there is a constant M such that
‖f_α‖ ≤ M, ∀α.

7. The family {e^{nit}; n = 0, ±1, ±2, ...} is uniformly bounded in L_2(−π, π). Show
that it contains no convergent subsequences.

8. Is the result of Problem 5 true in ἌΝ

9. Show that the unit sphere in C[a, b] also contains a continuum of line segments.

10. If E ⊂ F ⊂ X and E is of the second category in the complete metric space X, show
that F is also of the second category.

11. The boundary of a set E in a metric space X is the intersection of its closure with the
closure of its complement in X. If E is open, show that its boundary is a set of the
first category.

12. If a set E is nowhere dense in a metric space X, show that the interior of its closure is
void, Int(Ē) = ∅. Is this true if the set is merely of the first category?


5.2 PARTIAL ORDERING


The real number system is not merely the prototype of all linear vector spaces; it is
also the prototype of an ordered system. There exists in this system a binary order
relation expressed by the symbol "a ≤ b" (less than or equal to) or, equivalently,
"b ≥ a" with the following properties:

(O_1) For all a, a ≤ a (reflexive).
(O_2) If a ≤ b and b ≤ a, then a = b (antisymmetric).
(O_3) If a ≤ b and b ≤ c, then a ≤ c (transitive).

The real number system is linearly ordered or totally ordered in the sense that
for every pair of elements a and b either a ≤ b or b ≤ a.

A notion of order can be introduced in much more general systems than the 
real numbers. 


Definition 5.2.1. A set S of elements a, b, ... is said to be partially ordered if
there exists a binary relation a ≤ b defined for certain pairs (a, b) in S for
which (O_1) to (O_3) hold.


Just as in the theory of distance significant results are obtainable even if the
symmetry condition (D_2) of page 40 is dropped, the anti-symmetry condition
(O_2) may be dropped, as shown in a recent investigation by Ih-Ching Hsü
(University of New Mexico dissertation, 1969). We shall, however, find it
convenient to invoke all three conditions.

Some examples will clarify these notions. 


Example 1. Take S = C’[0, 1], the space of real-valued functions f defined and 
continuous in [0,1], and define 


    f ≤ g  to mean  f(t) ≤ g(t) for all t in [0, 1].        (5.2.1)


This is a meaningful partial ordering. It cannot be extended to a total ordering for
there are elements of the space for which neither f ≤ g nor g ≤ f is true. On the
other hand, for any f in S, the set [g; f ≤ g] is never void and the same is true for
the set [g; g ≤ f]. Condition (5.2.1) is equivalent to

    g - f \ge 0,        (5.2.2)

that is, g(t) − f(t) ≥ 0 for all t. Such "positive" elements play a basic role in the
theory of partially ordered function spaces. See further below.


Example 2. S is the set of subsets of a given set E. Here E_1 ≤ E_2 is defined to
mean E_1 ⊂ E_2, i.e. ordering under set inclusion. Here for any pair of sets E_1 and E_2
the intersection E_1 ∩ E_2 may very well be void and an inclusion E_1 ⊂ E_2 is
accidental. On the other hand, given any pair (E_1, E_2) in S, there exists at least one set
E_3 containing them both, so that E_1 ≤ E_3, E_2 ≤ E_3. We can take E_3 = E_1 ∪ E_2,
for instance.


Example 3. A partial ordering may be established in R^n as follows. Let

    x = (x_1, x_2, \ldots, x_n), \qquad y = (y_1, y_2, \ldots, y_n)

and define x ≤ y to mean that

    x_j \le y_j, \qquad j = 1, 2, \ldots, n.        (5.2.3)

This is equivalent to saying that y − x shall be a positive vector, i.e. one with
non-negative coordinates.


Definition 5.2.2. Let S be a partially ordered set. If a ≤ c and b ≤ c, then c is
called an upper bound for a and b. Moreover, if c ≤ d whenever d is an upper
bound of a and b, we call c the least upper bound or the supremum of a and b, in
symbols

    c = \sup(a, b) \quad \text{or} \quad c = a \vee b.        (5.2.4)

This element of S is unique if it exists. This fact depends essentially upon
(O_2). Similarly, we define the greatest lower bound or infimum of a and b

    p = \inf(a, b) \quad \text{or} \quad p = a \wedge b.        (5.2.5)


In all three cases listed above infima and suprema exist. Thus in Example 1 
we define 


c(t) = max [a(t), b(t) ], p(t) = min [a(t), b(t)]. (5.2.6) 


These are elements of C’[0, 1]. Here c(t) is an upper bound of a(t) and b(t) and 
clearly the least upper bound. Similarly, p(t) is seen to be the greatest lower bound. 
The other examples are left to the reader. 
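
The suprema and infima in the three examples can be written down explicitly. The following Python sketch is a small illustration under stated assumptions: functions are represented as Python callables, sets as `frozenset` objects, and vectors as tuples, and all helper names are hypothetical rather than taken from the text.

```python
def sup_functions(a, b):
    """Pointwise maximum, as in (5.2.6)."""
    return lambda t: max(a(t), b(t))

def inf_functions(a, b):
    """Pointwise minimum, as in (5.2.6)."""
    return lambda t: min(a(t), b(t))

def sup_vectors(x, y):
    """Coordinatewise maximum for the ordering (5.2.3)."""
    return tuple(max(xj, yj) for xj, yj in zip(x, y))

def inf_vectors(x, y):
    """Coordinatewise minimum for the ordering (5.2.3)."""
    return tuple(min(xj, yj) for xj, yj in zip(x, y))

# Example 1: two continuous functions on [0, 1] that are not comparable.
a = lambda t: t
b = lambda t: 1.0 - t
c = sup_functions(a, b)
p = inf_functions(a, b)

# Example 2: ordering by inclusion; union and intersection serve as sup and inf.
E1, E2 = frozenset({1, 2}), frozenset({2, 3})
sup_sets, inf_sets = E1 | E2, E1 & E2

# Example 3: neither x <= y nor y <= x holds, yet sup and inf exist.
x, y = (1, 5, 2), (3, 4, 2)
print(c(0.25), p(0.25), sorted(sup_sets), sorted(inf_sets),
      sup_vectors(x, y), inf_vectors(x, y))
```

In each case the supremum dominates both elements and the infimum is dominated by both, although the two elements themselves are not comparable.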

In a partially ordered set S the subset T is said to have b for an upper bound if
t ≤ b for all t ∈ T. A subset T may have a least upper bound, i.e. an element b of S
which is an upper bound of T and such that if c is any upper bound of T, then
b ≤ c. Similarly for greatest lower bounds. Both are unique if they exist.


Definition 5.2.3. An element m of a partially ordered set S is said to be maximal
if x ∈ S together with m ≤ x implies x = m.


For the existence of maximal elements we have the maximal principle of Max
Zorn, one form of which is


Zorn's Lemma. Let S be a non-empty partially ordered set such that every
totally ordered subset of S has an upper bound in S. Then S contains at least one
maximal element.


It is known that Zorn’s Lemma is equivalent to Zermelo’s Axiom of Choice 
in set theory. 


Definition 5.2.4. A lattice is a partially ordered set S any two elements of which
have a least upper bound and a greatest lower bound.


The examples listed above involve lattices.

To the analyst a more interesting situation arises if the set S is a linear vector
space as well as partially ordered. Here the partial ordering should satisfy the
requirements that the two transformations

    x \to x + a \quad \text{and} \quad x \to \alpha x, \qquad \forall x, \ \alpha > 0,        (5.2.7)

are order preserving, i.e.

(O_4) x ≤ y implies x + a ≤ y + a for all a,
(O_5) x ≤ y implies αx ≤ αy for all α > 0.


In partially ordered vector spaces where (O_1) to (O_5) are valid we can define
positive elements: x is positive if 0 ≤ x. The set of positive elements forms the
positive cone X^+ of X. This is a proper cone in the sense of Problem 15,
Exercise 1.2.


The positive cone X^+ contains the zero element and is invariant under addition
and multiplication by positive scalars.


If, in addition, X is a complete metric space, it is customary to require

(O_6) A convergent sequence of positive elements converges to a positive element.


We shall see later that transformations of a linear partially ordered vector 
space into itself which map the positive cone into itself are particularly important 
for analysis. 


EXERCISE 5.2 


1. Prove the existence of infima and suprema in Example 2.

2. Do the same for Example 3.

3. Why are infima and suprema unique when they exist?

4. Verify that the sets figuring in Examples 1 to 3 are lattices.

5. Verify that the functions c(t) and p(t) of (5.2.6) have the stated properties.

6. Let K(s, t) be a positive continuous function on the unit square [0, 1] × [0, 1] and
define

    T[f](t) = \int_0^1 K(s, t) \, f(s) \, ds.

This is a linear bounded transformation from C[0, 1] to itself. See (3.2.38) and
following. Show that T is order preserving in the sense that f ≤ g implies
T[f] ≤ T[g] and that the positive cone C^+[0, 1] is mapped into itself.


7. Formula (3.1.39) together with (3.1.40) defines a linear bounded transformation from 
/, into itself. When is this transformation order preserving? Here x < y means 
x_k ≤ y_k, ∀k.

8. A linear bounded functional on ἰ, is defined by (3.1.29). Under what conditions on
the vector b ∈ /, is the functional order preserving?


9. When is the functional defined by (3.2.36) order preserving? 


5.3 CONTRACTION MAPPINGS 


Let X and Y be complete metric spaces and let T be a mapping from X into Y.
Then to every x ∈ X corresponds a unique y ∈ Y, the image of x under the mapping
T, and we write

    y = T(x).        (5.3.1)

The mapping is said to be bounded if there exists a fixed finite M such that for x_1
and x_2 belonging to X

    d[T(x_1), T(x_2)] \le M \, d(x_1, x_2).        (5.3.2)

Note that in this formula d stands for distance and may mean one thing in the left
member and another in the right if X ≠ Y.
A bounded transformation is necessarily continuous for (5.3.2) is a type of
Lipschitz condition and hence implies continuity.
A mapping T is onto if every point y of Y is the image of at least one point of
X. It is 1-1 if

    x_1 \ne x_2 \quad \text{implies} \quad T(x_1) \ne T(x_2).        (5.3.3)


Suppose now that Y = X so that T is a mapping from X into itself. The
condition for boundedness is still given by (5.3.2), but now distance means the
same thing on both sides of the inequality. Suppose that T, and T, are two bounded 
transformations from X into itself. Just as in the linear case, we can then define 
products of transformations as follows. Set 


    (T_1 T_2)(x) = T_1[T_2(x)], \qquad (T_2 T_1)(x) = T_2[T_1(x)].        (5.3.4)


The products are well defined and normally T_1 T_2 ≠ T_2 T_1. The bounded operators
from X into itself form a semi-group S(X) under multiplication. Since multiplication
may possibly be non-associative, we are using the term “‘semi-group” simply as a 
designation for a system where one operation may be performed on any ordered 
pair of elements, the result being again an element of the system. A semi-group may 
have a neutral element U such that 

    UT = TU = T, \qquad \forall T.        (5.3.5)


In our case this is the identity mapping I,

    I(x) = x, \qquad \forall x.        (5.3.6)


Mappings may have inverses. We say that a mapping S of X into itself is the 
inverse of the mapping T of X into itself if 


    ST = TS = I.        (5.3.7)

We note that if T ∈ S(X), then all the powers of T are well defined and also
belong to S(X). We set

    T^n(x) = T[T^{n-1}(x)], \qquad n = 2, 3, \ldots.        (5.3.8)

The Law of Exponents is valid, so that

    T^m T^n = T^n T^m = T^{m+n}.        (5.3.9)

Note that

    (T^m)^n = T^{mn}.        (5.3.10)


If T^{-1} (the S of (5.3.7)) should exist, then (5.3.8) to (5.3.10) hold for all integers,
not merely the positive ones, with the convention that T^0 = I, T^1 = T.
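
The product and the powers of mappings are easy to experiment with. The Python sketch below is an illustration under stated assumptions: X is taken to be the real line, the two sample mappings `T1` and `T2` are hypothetical bounded mappings chosen for the example, and the helper names `compose` and `power` are not from the text.

```python
def compose(T1, T2):
    """Product T1 T2 in the sense of (5.3.4): apply T2 first, then T1."""
    return lambda x: T1(T2(x))

def power(T, n):
    """n-th power T^n for n >= 0, with T^0 the identity mapping."""
    result = lambda x: x
    for _ in range(n):
        result = compose(T, result)
    return result

# Two bounded mappings of the real line into itself (bounds 1/2 and 1/3).
T1 = lambda x: 0.5 * x + 1.0
T2 = lambda x: x / 3.0

x = 6.0
left = compose(T1, T2)(x)    # T1[T2(x)]
right = compose(T2, T1)(x)   # T2[T1(x)]

# Law of exponents: T^2 applied after T^3 agrees with T^5.
same_power = power(T1, 2)(power(T1, 3)(x)) == power(T1, 5)(x)
print(left, right, same_power)
```

The two products differ at x = 6, which illustrates that the order of the factors matters, while the powers of a single mapping combine by adding exponents.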

We are now going to specialize still further and consider bounded
transformations where the bound M is ≤ 1.


Definition 5.3.1. Consider a bounded mapping T of a complete metric space X
into itself so that T ∈ S(X). Then T is a contraction mapping in the wide sense if

    d[T(x_1), T(x_2)] \le d(x_1, x_2).        (5.3.11)

T is a strict contraction if

    d[T(x_1), T(x_2)] \le k \, d(x_1, x_2)        (5.3.12)

where k is a fixed number, 0 < k < 1. Finally, T is contractive if

    d[T(x_1), T(x_2)] < d(x_1, x_2).        (5.3.13)

In each case the inequality is to hold for all x_1 and x_2 in X and in (5.3.13) x_1 ≠ x_2.
We illustrate by some examples.


Example 1. Take X = C^+[0, 1] with the sup-norm and set

    T[f](t) = k \int_0^t f(s) \, ds, \qquad 0 \le t \le 1,        (5.3.14)

with a fixed positive k. If k = 1, this is a contraction in the wide sense since
obviously

    \|T[f] - T[g]\| \le k \, \|f - g\|.        (5.3.15)

If 0 < k < 1, this is a contraction in the strict sense.


Example 2. We take X = {x; 1 ≤ x} with the Euclidean metric and define

    T(x) = x + \frac{1}{x}.        (5.3.16)

Here X is a complete metric space and

    d[T(x_1), T(x_2)] = \frac{x_1 x_2 - 1}{x_1 x_2} \, |x_1 - x_2| < |x_1 - x_2| = d(x_1, x_2)

since at least one of the numbers x_1 and x_2 may be taken to be > 1. Thus this is a
contractive mapping and it is easy to see that it cannot be a strict contraction.
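
A short computation shows why no fixed k < 1 can serve in Example 2. The Python sketch below is illustrative only; the helper names `T` and `ratio` and the sample points are assumptions made for the example. The quotient d[T(x_1), T(x_2)] / d(x_1, x_2) equals 1 − 1/(x_1 x_2), which stays below 1 but comes arbitrarily close to 1 for large arguments, and the iterates of T grow without bound.

```python
def T(x):
    """The mapping of Example 2 on {x : x >= 1}."""
    return x + 1.0 / x

def ratio(x1, x2):
    """Quotient d[T(x1), T(x2)] / d(x1, x2) for distinct points."""
    return abs(T(x1) - T(x2)) / abs(x1 - x2)

# Ratios for pairs (x, x + 1) with growing x: always < 1, but tending to 1.
ratios = [ratio(x, x + 1.0) for x in (1.0, 10.0, 100.0, 1000.0)]

# The orbit of x = 1 under T keeps increasing, so it cannot approach a fixed point.
x = 1.0
for _ in range(1000):
    x = T(x)

print(ratios, x)
```

Since x_{n+1}^2 = x_n^2 + 2 + x_n^{-2}, the square of the n-th iterate exceeds 2n, which is consistent with the growth observed in the loop.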


Example 3. We take X = C[0, ∞], the space of functions continuous for all
t, 0 ≤ t ≤ ∞. Continuity at infinity means that f(t) tends to a finite limit as
t → ∞. The mapping is to be defined by the shift operator T(s), where

    T(s)[f](t) = f(t + s), \qquad 0 \le s.        (5.3.17)

This is a contraction in the wide sense since

    \|T(s)\| = 1        (5.3.18)

for all s. Note that f ≡ 1 is left invariant by T(s). Here {T(s)} is a one-parameter
family of contraction operators forming a semi-group in the sense that

    T(s + t) = T(s) \, T(t) = T(t) \, T(s).        (5.3.19)


In connection with Example 3 we shall discuss an inequality which goes back
to Edmund Landau (1877-1938) who in 1913 showed that if f, f', and f'' belong
to C[a, b], then

    \|f'\|^2 \le 4 \, \|f\| \, \|f''\|.        (5.3.20)


In particular, the inequality holds for C[0, ∞] and the same inequality was later
found to be valid in L_p(0, ∞) for any p ≥ 1. The intrinsic reason for these
inequalities was discovered in 1967 by R. R. Kallman and G.-C. Rota, who showed
that Landau's inequality is a special case of an inequality that holds for any
contraction semi-group of linear operators on a B-space X into itself. If T(s) is strongly
continuous in a sense to be defined in Chapter 7 and if T(0) = I, then a linear, in
general unbounded, operator A from X to X is defined by the condition

    A[x] = \lim_{h \to 0} \frac{1}{h} [T(h) - I](x).        (5.3.21)

The operator A is known as the infinitesimal generator of the semi-group. Its
domain of definition D[A] is dense in X and so are the domains of the iterates A^n.
For x ∈ D[A^2] Kallman and Rota proved that

    \|A(x)\|^2 \le 4 \, \|x\| \, \|A^2(x)\|.        (5.3.22)

This holds for any contraction semi-group, in particular for the semi-group of
shift operators (5.3.17). In the latter case A is the operation of differentiation with
respect to t and (5.3.20) results. The proof of (5.3.22) is based on Taylor's theorem
with remainder, a form of which is valid for semi-group operators as shown by
E. Hille and K. Yosida independently of each other in the 1940's. As applied to the
shift operator this argument reduces to the classical Taylor's theorem.


Theorem 5.3.1. Let X be one of the spaces C[0, ∞] or L_p(0, ∞) and let f be an
element of X for which f' and f'' also belong to X. Then (5.3.20) holds where the
norm is that of the space in question.

Proof. By Taylor's theorem, for s > 0, t ≥ 0,

    f(t + s) = f(t) + s f'(t) + \int_0^s (s - u) \, f''(t + u) \, du,

whence

    f'(t) = \frac{1}{s} [f(t + s) - f(t)] - \frac{1}{s} \int_0^s (s - u) \, f''(t + u) \, du.

Here all functions involved belong to X and so does the integral for all s, since

    \Bigl\| \int_0^s (s - u) \, f''(\cdot + u) \, du \Bigr\| \le \|f''\| \int_0^s (s - u) \, du = \tfrac{1}{2} s^2 \|f''\|.

This is trivial for C[0, ∞] and for L_p it follows from the approximation of the
integral by a Riemann sum, the summands of which are in L_p, plus Minkowski's
inequality. This argument gives

    \|f'\| \le \frac{2}{s} \|f\| + \tfrac{1}{2} s \, \|f''\|        (5.3.23)

since we deal with contraction operators. Here s is arbitrary, s > 0. If ‖f''‖ = 0,
we let s → ∞ and obtain ‖f'‖ = 0 so that (5.3.20) is trivially true. If ‖f''‖ > 0, we
minimize the right member of (5.3.23) by taking

    s = 2 \, \|f\|^{1/2} \, \|f''\|^{-1/2}.

Substitution of this value in (5.3.23) and simplification give (5.3.20). ∎
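
The minimization at the end of the proof can be checked numerically. The Python sketch below is an illustration only. It assumes the bound has the form (2/s)‖f‖ + (s/2)‖f''‖, as the argument indicates, and the two norm values are hypothetical numbers chosen so that the arithmetic is exact; the function names are not from the text.

```python
import math

def landau_bound(s, norm_f, norm_f2):
    """Upper bound for ||f'||: (2/s)||f|| + (s/2)||f''||, valid for every s > 0."""
    return 2.0 * norm_f / s + 0.5 * s * norm_f2

def optimal_s(norm_f, norm_f2):
    """Minimizing value s = 2 ||f||^(1/2) ||f''||^(-1/2)."""
    return 2.0 * math.sqrt(norm_f / norm_f2)

norm_f, norm_f2 = 3.0, 0.75            # hypothetical values of ||f|| and ||f''||
s_star = optimal_s(norm_f, norm_f2)
best = landau_bound(s_star, norm_f, norm_f2)

# Brute-force comparison on a grid of s-values in (0, 20].
grid_min = min(landau_bound(0.01 * j, norm_f, norm_f2) for j in range(1, 2001))

print(s_star, best, best ** 2, 4.0 * norm_f * norm_f2, grid_min)
```

The square of the minimal bound coincides with 4‖f‖‖f''‖, which is the right member of Landau's inequality. As a concrete instance, f(t) = e^{-t} on [0, ∞) has ‖f‖ = ‖f'‖ = ‖f''‖ = 1 in the sup-norm, and the inequality reads 1 ≤ 4.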


EXERCISE 5.3 


1. If X = C^+[0, a], when is

    T[f](t) = \int_0^t (t - s) \, f(s) \, ds, \qquad 0 \le t \le a,

a contraction? Find by inspection an element left invariant by T.


2. In L_2(0, 2π) a linear transformation from L_2 to L_2 is determined by its action on the
Fourier coefficients; if

    f(t) \sim \sum f_n e^{int}, \qquad T[f](t) \sim \sum F_n e^{int},

take

    F_n = \lambda_n f_n, \qquad \forall n,

where {λ_n} is a given sequence of complex numbers. What condition on the λ_n will
make T (i) bounded, (ii) a contraction, (iii) contractive?


3. With the same notation, suppose that {λ_n} is a bounded sequence and determine the
nullspace of T. When does N[T] reduce to {0}?


4. What condition on the λ's will ensure the existence of T^{-1} as an element of S(L_2)?
Find T^{-1} when it exists.

5. Is there a fixed point in Example 2 above?

6. [J. B. Miller] With X as in Theorem 5.3.1, assume that f, f', f'', and f''' belong to X
and prove that (with associated non-optimal numerical constants)

2 1] I | i

vr < + 6 ; eit
In < Ὁ i 161A 1.511}

7. [S. Kurepa] With X equal to one of the spaces C[−∞, ∞] or L_p(−∞, ∞), assume
that f and its derivatives of order ≤ 4 belong to X. Show that

    \tfrac{1}{2} [f(t + s) + f(t - s)] = T(s)[f](t)

is a family of contraction operators and use the Kallman-Rota device to prove the
inequality

    \|f''\|^2 \le \tfrac{4}{3} \, \|f\| \, \|f^{(4)}\|.

8. The following transformation is associated with the name of Emile Picard. Take X
as one of the spaces C[−∞, ∞] or L_p(−∞, ∞) and form

    T(s)[f](x) = \tfrac{1}{2} s \int_{-\infty}^{\infty} \exp[-s |x - u|] \, f(u) \, du, \qquad 0 < s.

Show that T(s) is a linear contraction mapping from X into itself.

9. If f ∈ C[−∞, ∞] show that

    \lim_{s \to \infty} T(s)[f](x) = f(x)

uniformly on compact sets.

10. With X as in the preceding problem show that T(s)[f](x) satisfies the differential
equation

    F''(x) - s^2 F(x) = -s^2 f(x)

and is the only solution of this equation in X.

11. Prove that

    (s^2 - t^2) \, T(s) T(t)[f] = s^2 \, T(t)[f] - t^2 \, T(s)[f].

12. Suppose that f is a measurable function of x such that T(s)[f] exists for all s > 0.
Suppose that T(s)[f] = f for all s > 0. Show that f(x) = ax + b for some choice of
the constants a and b. When is such a function in (1) C[−∞, ∞], (2) L_p(−∞, ∞)?


5.4 CONTRACTION FIXED POINT THEOREMS


We refer to Definition 5.3.1 for the terminology. In the present section "contraction"
shall mean "strict contraction." We shall prove the contraction fixed
point theorem of S. Banach.

Thus we are concerned with a complete metric space X which is mapped into
itself by a transformation T such that

    d[T(x_1), T(x_2)] \le k \, d(x_1, x_2)        (5.4.1)

for all x_1, x_2 ∈ X, where k is a fixed number, 0 < k < 1.

Theorem 5.4.1. If T is a contraction of X into itself in the sense of (5.4.1), then
T has a unique fixed point, i.e. there exists one and only one point x_0 ∈ X such that

    T(x_0) = x_0.        (5.4.2)


Proof. We start with an arbitrary point x_1 in X and form the successive transforms
of x_1 by the powers of T,

    x_{n+1} = T(x_n) = T^n(x_1), \qquad n = 1, 2, \ldots.        (5.4.3)

It will be proved first that {x_n} is a Cauchy sequence in X. For m < n

    d(x_m, x_n) \le d(x_m, x_{m+1}) + d(x_{m+1}, x_{m+2}) + \cdots + d(x_{n-1}, x_n)        (5.4.4)

by the triangle inequality. Formula (5.4.1) shows that for any integer p > 1

    d(x_p, x_{p+1}) = d[T(x_{p-1}), T(x_p)] \le k \, d(x_{p-1}, x_p),

and by repeated use of this device,

    d(x_p, x_{p+1}) \le k^{p-1} \, d(x_1, x_2).

Summing these inequalities from p = m to p = n − 1 we obtain

    d(x_m, x_n) \le [k^{m-1} + k^m + \cdots + k^{n-2}] \, d(x_1, x_2).

Thus

    d(x_m, x_n) \le \frac{k^{m-1}}{1 - k} \, d(x_1, x_2).        (5.4.5)

Since this goes to zero as m → ∞, it is seen that {x_n} is a Cauchy sequence in X.
Now X is a complete metric space, so we can aver the existence of

    \lim_{n \to \infty} x_n = x_0.

From

    x_{n+1} = T(x_n)

and the continuity of T which is a consequence of the Lipschitz condition (5.4.1)
it follows that

    x_0 = \lim_{n \to \infty} x_{n+1} = \lim_{n \to \infty} T(x_n) = T\bigl(\lim_{n \to \infty} x_n\bigr) = T(x_0),        (5.4.6)

so that (5.4.2) holds.
Furthermore, x_0 is the only fixed point. For suppose that

    y_0 = T(y_0), \qquad y_0 \in X.

Then

    d(x_0, y_0) = d[T(x_0), T(y_0)] \le k \, d(x_0, y_0)

and this is possible iff y_0 = x_0. ∎

As an illustration we may take Example 1 of the preceding section. Thus
X = C^+[0, 1] and

    T[f](t) = k \int_0^t f(s) \, ds, \qquad 0 \le t \le 1,

where 0 < k < 1. This is obviously a contraction, so there is a unique fixed point.
On the other hand, by inspection the zero element is left invariant, so it must be the
unique fixed point. This example is instructive inasmuch as the zero element is a
fixed point for any value of k, not necessarily restricted to the interval (0, 1).
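
The iteration used in the proof of Theorem 5.4.1 is also a practical algorithm, and the estimate for d(x_m, x_n) gives an a priori error bound when n tends to infinity. The Python sketch below is a minimal illustration under stated assumptions: X is taken to be the interval [0, 1] with the usual distance, T is the cosine function (which maps [0, 1] into itself and satisfies a Lipschitz condition there with k = sin 1 < 1), and the helper names are hypothetical.

```python
import math

def iterate(T, x1, steps):
    """Return [x_1, x_2, ..., x_steps] with x_{n+1} = T(x_n)."""
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(T(xs[-1]))
    return xs

T = math.cos              # maps [0, 1] into [cos 1, 1]
k = math.sin(1.0)         # Lipschitz constant of cos on [0, 1], about 0.84

xs = iterate(T, 1.0, 100)
x_fixed = xs[-1]          # numerically indistinguishable from the fixed point
d12 = abs(xs[0] - xs[1])  # d(x_1, x_2)

def a_priori_bound(m):
    """Bound k^(m-1) / (1 - k) * d(x_1, x_2) for the distance from x_m to the fixed point."""
    return k ** (m - 1) / (1.0 - k) * d12

# xs[m - 1] is x_m, since Python lists start at index 0.
checks = [abs(xs[m - 1] - x_fixed) <= a_priori_bound(m) for m in range(1, 41)]
print(x_fixed, all(checks))
```

The bound depends only on k and on the first step d(x_1, x_2), so the number of iterations needed for a prescribed accuracy can be estimated before the iteration is carried out.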

Since the powers of T are used in the proof of Theorem 5.4.1, it would seem 
plausible that the contractive properties of the powers of T may play some role for 
the existence and uniqueness of fixed points. This is, indeed, the case. 


Theorem 5.4.2. T has a unique fixed point if one of the powers of T is a
contraction.


Proof. Suppose that T^m is a contraction. By the preceding theorem there is a
unique point x_0 such that

    T^m(x_0) = x_0.        (5.4.7)

In the proof of Theorem 5.4.1 the fixed point was obtained as the limit of a Cauchy
sequence {x_n} where the first element x_1 is arbitrary. We apply this observation to
the present case, taking x_1 = T(x_0) and replacing T by T^m in the iterative process.
This gives

    \lim_{n \to \infty} (T^m)^n [T(x_0)] = x_0.        (5.4.8)

Since the powers of T commute, we can rewrite this as

    T\bigl[\lim_{n \to \infty} (T^m)^n x_0\bigr] = x_0.

Here (T^m)^n x_0 = x_0 for all n, so we are left with

    T(x_0) = x_0.

Thus x_0 is also a fixed element under T itself. Moreover, it must be the only fixed
point under T, for any fixed point of T is also a fixed point of T^m, which by
assumption is a contraction and has a unique fixed point. It follows that T has one
and only one fixed point, namely x_0. ∎


This theorem is a powerful extension of the Banach contraction fixed point
theorem since the contractive power of a bounded transformation often increases
upon iteration. One of our standard examples illustrates this. We take X = C[a, b],
0 < b − a < ∞ and

    T[f](t) = \int_a^t f(s) \, ds.        (5.4.9)

Here ‖T‖ = b − a so that T is not a contraction if b − a ≥ 1. On the other hand,

    T^m[f](t) = \int_a^t \frac{(t - s)^{m-1}}{(m - 1)!} \, f(s) \, ds        (5.4.10)

with norm

    \|T^m\| = \frac{(b - a)^m}{m!},

so that T^m is a contraction for all large values of m. Since the zero element is
obviously invariant, this is the only fixed point.
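
How large m must be is easily computed. The Python sketch below is illustrative only: it takes the norm of T^m to be (b − a)^m / m!, the value obtained by integrating the kernel of the iterated integral over a ≤ s ≤ t ≤ b, and the function names `power_norm` and `first_contracting_power` are hypothetical.

```python
import math

def power_norm(length, m):
    """Norm of T^m for the integration operator on an interval of the given length."""
    return length ** m / math.factorial(m)

def first_contracting_power(length):
    """Smallest m for which the norm of T^m is strictly less than 1."""
    m = 1
    while power_norm(length, m) >= 1.0:
        m += 1
    return m

for length in (0.5, 1.0, 3.0):
    m = first_contracting_power(length)
    print(length, m, power_norm(length, m))
```

For an interval of length 3 the norms first increase (3, 4.5, 4.5, ...) before the factorial takes over, and the seventh power is the first one that is a contraction.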


EXERCISE 5.4 


1. Let X = R^1 and define T(x) = ½(x + sin x). Show that T is contractive. Are there
any fixed points?

2. Consider the transformation on R^2 defined by

    T(x, y) = (x, ay), \qquad a \ne 0, 1.

Show that there are infinitely many fixed points. Is T contractive for any value of a?
Use the Euclidean metric.


3. [G. Szekeres] Given the transformation on R^2 defined by

    T(x, y) = [x + y, \ y - (x + y)^3].

T is odd in the sense that T(−x, −y) = −T(x, y). Show that (0, 0) is the only
fixed point of T. Show that T^2 has in addition the two fixed points (2, −4) and
(−2, 4) which form a fixed point cycle of order two, i.e. the two fixed points of T^2
are permuted by T.


4. The surface S of a torus (anchor ring) is given by the parametric equations

    x = \cos\alpha \, [R + r \cos\beta], \qquad y = \sin\alpha \, [R + r \cos\beta], \qquad z = r \sin\beta,

where 0 < r < R. Show that there are two families of continuous transformations
of S into itself which are without fixed points. [Hint: What is the geometric meaning
of the parameters?]


5. Take X = C[−∞, ∞], let s > 0 and consider the Poisson transformation

    P_s[f](t) = \frac{s}{\pi} \int_{-\infty}^{\infty} \frac{f(t + u) \, du}{s^2 + u^2}.

Show that P_s maps X into itself and, in particular, maps the positive cone of X into
itself. Show that P_s is a contraction in the wide sense. There are infinitely many
fixed points. Find some of them.


6. Show that P_s[P_t] = P_t[P_s] = P_{s+t}, the semi-group property.

7. Does the Picard transformation of Problem 8, Exercise 5.3, have any fixed points in
C[−∞, ∞]?

The next four problems give examples of transformations T from C^+[0, b] into itself.
Determine the nature of the transformation (contraction in the wide sense or in the strict
sense or contractive mapping or neither one nor the other) and determine the fixed points.
Let T[f](t) be


8. f(t) − log[1 + f(t)].
9. 1 − exp[−f(t)].
10. ∫_0^t [f(s)]^{1/2} ds.
11. (OLS) as. 
12. Is the mapping defined by Problem 10 bounded in the sense of (5.3.2)? Show that there
are infinitely many fixed points and that the corresponding curves fill out a region in
the plane with one and only one curve through each interior point of the region.


5.5 CONTRACTIVE MAPPINGS 


We have seen that a contractive mapping need not have a fixed point. Thus the 
transformation 
T(x) =x+x7! 


on ἃ = {x; 1 < x} with the Euclidean metric exhibits this phenomenon. It is 
necessary to impose additional conditions on the mapping. Such a condition was 
found in 1964 by M. Edelstein, who proved 


Theorem 5.5.1. Let T be a contractive mapping of a complete metric space X
into itself and suppose that there is a point $x_1 \in X$ such that the sequence $\{T^n(x_1)\}$
contains a subsequence converging to a point $x_0 \in X$. Then $x_0$ is a fixed point of T
and it is unique.


Proof. The proof requires three steps. We start by observing that if
$\{n_j\}$, $n_j < n_{j+1}$, $j = 1, 2, 3, \ldots$, is a sequence of integers such that
$$x_j = T^{n_j}(x_1) \to x_0 \tag{5.5.1}$$
is a convergent subsequence, then $d(x_j, x_0) < \varepsilon$ for $j > N_\varepsilon$. Take $i > N_\varepsilon$ and set
$m = n_{i+1} - n_i$. Then
$$d[T^m(x_0), x_{i+1}] = d[T^m(x_0), T^m(x_i)] \le d(x_0, x_i) < \varepsilon,$$
since $T^m$ is also contractive. Hence
$$d[T^m(x_0), x_0] \le d[T^m(x_0), x_{i+1}] + d[x_{i+1}, x_0] < 2\varepsilon$$
by the triangle inequality. It is true that here $\varepsilon$ is arbitrary, but the choice of $\varepsilon$ will
affect the choice of i and hence that of m. As a result of the first step we see, however,
that there exist integers m such that $T^m(x_0)$ is arbitrarily close to $x_0$.

For the second step we fix $\varepsilon$, choose i and m as above, and consider
$$T^m(x_0) = y_0. \tag{5.5.2}$$
If it should turn out that $y_0 = x_0$, then we can pass directly to the final step. We
now assume that $y_0 \ne x_0$ and will show that this assumption leads to a contradiction.
Consider the mapping
$$(x, y) \to \frac{d[T(x), T(y)]}{d(x, y)} \equiv f(x, y). \tag{5.5.3}$$
The quotient is defined and continuous as long as $y \ne x$, hence, in particular, for
$x = x_0$, $y = y_0$. Since T is contractive, $f(x, y) < 1$ wherever it is defined. If now
$f(x_0, y_0) = q < 1$, then we can find a $\delta > 0$ and a number k, $q < k < 1$, such that
$$d[T(x), T(y)] < k\,d(x, y) \quad \text{for} \quad d(x, x_0) < \delta, \quad d(y, y_0) < \delta. \tag{5.5.4}$$
In order to achieve this it is enough to choose $\delta$ so small that
$$|f(x, y) - f(x_0, y_0)| < \tfrac{1}{2}(1 - q)$$
in the neighborhood of $(x_0, y_0)$ in question. This is possible since f is continuous.
We have then also
$$f(x, y) < q + \tfrac{1}{2}(1 - q) = \tfrac{1}{2}(1 + q) < 1.$$
We take $k = \tfrac{1}{2}(1 + q)$. Since
$$\lim_{j \to \infty} T^m(x_j) = T^m\big(\lim_{j \to \infty} x_j\big) = T^m(x_0) = y_0,$$
we see that
$$d(x_j, x_0) < \delta, \qquad d[T^m(x_j), y_0] < \delta, \qquad \text{for } j > N,$$
some large positive integer. Hence
$$d[T(x_j), T\,T^m(x_j)] < k\,d[x_j, T^m(x_j)].$$
T being contractive, we have for any positive integer p
$$d[T^p(x_j), T^{m+p}(x_j)] \le d[T(x_j), T^{m+1}(x_j)] < k\,d[x_j, T^m(x_j)].$$
In particular, for $p = n_{j+1} - n_j$,
$$d[x_{j+1}, T^m(x_{j+1})] < k\,d[x_j, T^m(x_j)]$$
for any $j > N$. Thus for $r > j$,
$$d[x_r, T^m(x_r)] < k^{r-j}\,d[x_j, T^m(x_j)]. \tag{5.5.5}$$
Here as $r \to \infty$, the left member goes to $d(x_0, y_0)$ while the right member goes to
zero. This contradiction shows that $y_0$ cannot be distinct from $x_0$. Hence we have
$x_0 = y_0 = T^m(x_0)$, and this completes the second step.

The third step is much simpler. We have now
$$T^m(x_0) = x_0 \tag{5.5.6}$$
for some positive integer m. Suppose that
$$T(x_0) = v_0 \ne x_0.$$
Then
$$T^m(v_0) = T^m T(x_0) = T\,T^m(x_0) = T(x_0) = v_0$$
and
$$d(x_0, v_0) = d[T^m(x_0), T^m(v_0)] < d(x_0, v_0)$$
since T is contractive. Thus the assumption that $v_0 \ne x_0$ also leads to a contradiction
and we must have
$$v_0 = x_0, \qquad T(x_0) = x_0,$$
and $x_0$ is a fixed point of T.
To prove uniqueness, we suppose that $z_0$ is a fixed point of T with $z_0 \ne x_0$. Then
$$d(x_0, z_0) = d[T(x_0), T(z_0)] < d(x_0, z_0),$$
again because T is contractive. This contradiction shows that $z_0 = x_0$ and there is
one and only one fixed point. ∎


There is an important corollary which we state as 


Theorem 5.5.2. If T is a contractive mapping of a complete metric space X into
a compact subset of itself, then T has a unique fixed point $x_0$ and for every $x \in X$
$$x_0 = \lim_{n \to \infty} T^n(x). \tag{5.5.7}$$

Proof. To see this, note that the range T(X) is sequentially compact. This means
that for a fixed x in X the sequence $\{T^n(x)\}$ has at least one cluster point in X.
Hence there is a convergent subsequence and by the preceding theorem T has a
unique fixed point given by the limit of the convergent subsequence. Thus the
original sequence can have only one cluster point. Hence (5.5.7) holds and the
limit is independent of the choice of x in X. ∎
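
A numerical illustration may be helpful at this point. The following Python sketch is an added illustration under stated assumptions, not part of the original argument. It iterates the two contractive mappings that occur in this section: $T(x) = \sin x$ on $R^1$, whose range $[-1, 1]$ is compact so that Theorem 5.5.2 applies, and $T(x) = x + x^{-1}$ on $\{x;\ 1 \le x\}$, which has no fixed point. The helper name `iterate`, the starting values, and the number of steps are hypothetical choices.

```python
import math

def iterate(T, x, n):
    """Return the n-th iterate T^n(x)."""
    for _ in range(n):
        x = T(x)
    return x

n = 10_000

# T(x) = sin x is contractive on R^1 and maps into the compact set [-1, 1];
# by Theorem 5.5.2 the iterates tend to the unique fixed point 0.
# The convergence is slow (roughly like sqrt(3/n)) because sin is not a
# contraction in the strict sense near 0.
x_sin = iterate(math.sin, 1.0, n)

# T(x) = x + 1/x is contractive on {x; 1 <= x} but has no fixed point:
# the square of the iterate grows by more than 2 at every step.
x_div = iterate(lambda x: x + 1.0 / x, 1.0, n)

print(x_sin, math.sqrt(3.0 / n))
print(x_div, math.sqrt(2.0 * n))
```

The first line of output shows the iterates of the sine creeping toward 0, the second shows the iterates of $x + x^{-1}$ escaping to infinity, so the compactness hypothesis cannot simply be dropped.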


EXERCISE 5.5 


1. If $X = \{x;\ 1 \le x\}$ with the Euclidean metric and if $T(x) = x + x^{-1}$, then for
fixed x the sequence $\{T^n(x)\}$ is obviously increasing. Is it bounded?


2. Take $X = R^1$ and $T(x) = \sin x$. Show that T maps $R^1$ onto a compact subset.
Show that T is contractive and find the fixed point. 


3. Take $X = R^2$ with the maximum coordinate metric and, further, let $T(x, y) =
[\tfrac{1}{2} \arctan(x + y), \tfrac{1}{2} \sin(x - y)]$ where the arc tangent has its principal value.
Is $T(R^2)$ compact? Show that T is contractive and find the fixed point.


4. The optimal value of k in formula (5.5.4) depends upon $x_0$, $y_0$ and $\delta$. To illustrate,
use Problem 2 above and set xy = ἐπ, Yo = in, ὃ = τἷσπ. Show that k will not 
exceed cos (=1,7). What happens to k when Xg and yg approach the fixed point? 


5. The Poisson transformation defined in Problem 5, Exercise 5.4, maps $L_2(-\infty, \infty)$
into itself. Prove this and show that $P_s$ is contractive and has a unique fixed point.
5.6 THE VOLTERRA FIXED POINT THEOREM


We use this designation for a theorem which underlies the work of Vito Volterra
(1860-1940) on linear integral equations in the 1890's.


Theorem 5.6.1. Let X be a B-space. Let $z_0$ be a given element of X and let
$S \in \mathfrak{E}(X)$ be such that
$$1 + \sum_{n=1}^{\infty} \|S^n\| = A < \infty. \tag{5.6.1}$$
Then the transformation T defined by
$$T(x) = z_0 + S(x) \tag{5.6.2}$$
has a unique fixed point given by
$$x_0 = z_0 + \sum_{n=1}^{\infty} S^n(z_0). \tag{5.6.3}$$

Proof. Here again we start with an arbitrary element $x_1$ of X and form the
sequence $\{x_n\}$ where
$$x_n = T(x_{n-1}), \qquad n = 2, 3, \ldots.$$
Thus
$$x_n = z_0 + \sum_{k=1}^{n-2} S^k(z_0) + S^{n-1}(x_1)$$
and
$$\|x_{n+p} - x_n\| = \| -S^{n-1}(x_1) + S^{n-1}(z_0) + \cdots + S^{n+p-2}(z_0) + S^{n+p-1}(x_1) \|.$$
This does not exceed
$$\sum_{k=n-1}^{\infty} \|S^k\| \, [\|z_0\| + \|x_1\|],$$
and by condition (5.6.1) this goes to zero as $n \to \infty$. Hence $\{x_n\}$ is a Cauchy
sequence in the complete metric space X and its limit is
$$x_0 = \lim_{n \to \infty} x_n = z_0 + \sum_{n=1}^{\infty} S^n(z_0)$$
as asserted above. This is a fixed point of T since
$$x_0 = \lim_{n \to \infty} x_n = \lim_{n \to \infty} T(x_{n-1}) = T\big(\lim_{n \to \infty} x_{n-1}\big) = T(x_0).$$
If $y_0$ is a fixed point of T, then
$$x_0 - y_0 = T(x_0) - T(y_0) = S(x_0 - y_0) = \cdots = S^n(x_0 - y_0)$$
and this goes to 0 as $n \to \infty$. Hence $y_0 = x_0$ and the fixed point is unique. ∎
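
The iteration used in the proof can be tried out in finite dimensions. The following Python sketch is an added illustration, with a hypothetical $2 \times 2$ matrix standing in for S and hypothetical helper names `mat_vec` and `volterra_fixed_point`. The matrix is chosen so that its maximum row sum exceeds 1, which means that T is not a contraction in the sup norm, while the norms of the powers $S^n$ are still summable as required by (5.6.1).

```python
def mat_vec(S, x):
    """Multiply the square matrix S (a list of rows) by the vector x."""
    return [sum(s_jk * x_k for s_jk, x_k in zip(row, x)) for row in S]

def volterra_fixed_point(S, z0, n_iter=200):
    """Iterate x_n = z0 + S(x_{n-1}), starting from x_1 = z0."""
    x = list(z0)
    for _ in range(n_iter):
        x = [z_j + y_j for z_j, y_j in zip(z0, mat_vec(S, x))]
    return x

# Hypothetical example: the maximum row sum of S is 2, but S^2 = 0.2 * I,
# so the norms of the powers decay geometrically and (5.6.1) holds.
S = [[0.0, 2.0],
     [0.1, 0.0]]
z0 = [1.0, 1.0]
x0 = volterra_fixed_point(S, z0)
print(x0)  # approximately [3.75, 1.375], the solution of x = z0 + S x
```

Solving $x = z_0 + S(x)$ by hand for this matrix gives $x = (3.75, 1.375)$, which is what the partial sums of (5.6.3) approach.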
The solution (5.6.3) is known as the Neumann series. It is named after
Carl Neumann (1832-1925) who did important work in potential theory and in
algebraic functions and their integrals. The fixed point of (5.6.2) is the solution of
the functional equation
$$x = z_0 + S(x) \tag{5.6.4}$$
or
$$[I - S](x) = z_0, \tag{5.6.5}$$
whence, formally,
$$x = [I - S]^{-1}(z_0). \tag{5.6.6}$$
Here $[I - S]^{-1}$ is the inverse transformation of $I - S$ or, in the notation of
(2.4.24), R(1, S), the resolvent of S evaluated at $\lambda = 1$. The proof of Theorem 5.6.1
shows that these formal considerations make sense: the resolvent does exist for
$\lambda = 1$ and is given by the Neumann series. First, the series (5.6.3) converges in
norm since
$$\|z_0\| + \sum_{n=1}^{\infty} \|S^n(z_0)\| \le A \|z_0\| \tag{5.6.7}$$
by condition (5.6.1). Secondly, we have
$$(I - S)\Big[z_0 + \sum_{k=1}^{n} S^k(z_0)\Big] = \Big[I + \sum_{k=1}^{n} S^k\Big][I - S](z_0) = z_0 - S^{n+1}(z_0) \to z_0$$
as $n \to \infty$. This shows that the series (5.6.3) represents $R(\lambda, S)(z_0)$ for $\lambda = 1$ and
we see that the spectrum of S is confined to the open disk $|\lambda| < 1$. The spectral
radius of S is <1 and no stronger assertion can be made without imposing further 
conditions on S. Actually the Neumann series is no novelty to us; its first occurrence 
in this treatise was in formula (1.5.38) for the resolvent of a matrix. 


Example 1. We take our standard transformation with $X = C[a, b]$ and S defined
by integration from a to t. Further, $z_0 = g(t)$ is in $C[a, b]$. Equation (5.6.4) now
becomes
$$f(t) = g(t) + \int_a^t f(s)\,ds, \qquad a \le t \le b, \tag{5.6.8}$$
with the solution
$$f(t) = g(t) + \sum_{n=0}^{\infty} \frac{1}{n!} \int_a^t (t - s)^n g(s)\,ds$$
or
$$f(t) = g(t) + \int_a^t \exp(t - s)\, g(s)\,ds. \tag{5.6.9}$$
In this case the spectral radius is 0 (why?) and $\lambda = 0$ is the only spectral value.
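
Formula (5.6.9) can be checked numerically. The Python sketch below is an added illustration under stated assumptions: it runs the successive approximations $f_{k+1}(t) = g(t) + \int_a^t f_k(s)\,ds$ on a uniform grid with the trapezoidal rule, for the hypothetical choice $g(t) \equiv 1$ on $[0, 1]$, where (5.6.9) gives $f(t) = 1 + \int_0^t e^{t-s}\,ds = e^t$. The function name `picard_volterra`, the grid size and the iteration count are assumptions made for the example.

```python
import math

def picard_volterra(g, a, b, n_grid=200, n_iter=30):
    """Successive approximations f_{k+1}(t) = g(t) + int_a^t f_k(s) ds on a uniform grid."""
    h = (b - a) / n_grid
    ts = [a + i * h for i in range(n_grid + 1)]
    gs = [g(t) for t in ts]
    f = gs[:]                      # first approximation: f_1 = g
    for _ in range(n_iter):
        running = 0.0              # trapezoidal value of int_a^{t_i} f(s) ds
        new_f = [gs[0]]
        for i in range(1, n_grid + 1):
            running += 0.5 * h * (f[i - 1] + f[i])
            new_f.append(gs[i] + running)
        f = new_f
    return ts, f

ts, f = picard_volterra(lambda t: 1.0, 0.0, 1.0)
max_err = max(abs(fi - math.exp(ti)) for ti, fi in zip(ts, f))
print(max_err)  # small: only the discretization error of the trapezoidal rule remains
```

The iteration error decays like $1/n!$, in line with the vanishing spectral radius, so a few dozen steps already leave only the quadrature error.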
EXERCISE 5.6


1. Find the solution of the matrix vector equation
$$y = z + Q \cdot y, \qquad Q \in \mathfrak{M}_n, \quad Q^p = 0, \quad Q^{p-1} \ne 0.$$
Here $z \in C^n$ and the unique solution $y \in C^n$ is requested.


2. An apparently trivial application of Theorem 5.6.1 is to find the solution of
$$f(t) = g(t) + \mu F(t) f(t),$$
where g and F are given elements of C[0, 1] and $\mu$ is a complex parameter. Get the
formal solution and decide what values of $\mu$ must be excluded in order that
$f \in C[0, 1]$.


3. Verify the assertion that $\sigma(S) = \{0\}$ in the case of (5.6.8).


The next three problems deal with a special Fredholm integral equation [Ivar Fredholm 
(1866-1927)]: 


fit) Ξ 44) + Ἢ | K(t — s) f(s) ds. 
Tl - ὦ 


Here $\mu$ is a complex parameter; the function g and the kernel K are given elements of
$L_2(-\pi, \pi)$,
$$g(t) \sim \sum_{n=-\infty}^{\infty} g_n e^{nit}, \qquad K(u) \sim \sum_{n=-\infty}^{\infty} k_n e^{niu}.$$
This is a functional equation to which Theorem 5.6.1 applies for sufficiently small values of
$|\mu|$. Here $X = L_2(-\pi, \pi)$, $z_0 = g(t)$, and


sifu = £ [Κα -- 9 Fa. 


4. Show that μᾺΡ ft 
Sf Mt) = (=) Ι΄ K,(t — s) f(s) ds, 


where a 
Ku) ~ 2 (k,)Pe"™. 


Show that the series converges absolutely and uniformly when p > 1. 


5. If $f(t; \mu)$ denotes the solution of the Fredholm equation, show that


fo 6) 
k,, In nit 


OD OO Daag 


n 


where the series is absolutely and uniformly convergent with respect to t. What
values of $\mu$ must be excluded in making this statement?
6. Show that the excluded values of $\mu$ are the reciprocals of the characteristic values of
the kernel as an operator in $L_2$, i.e. those values of $\lambda$ for which


π 


1 
Ah(t) — = | K(t — s)h(s) ds = 0 
πα 


has a non-trivial solution in $L_2(-\pi, \pi)$. [Hint: Postulate a solution in terms of a
Fourier series and determine the conditions on $\lambda$.]


5.7 SOME APPLICATIONS 


As a matter of fact, all of Chapter 6 and most of Chapter 12 are given to applications
of fixed point theorems to basic questions of analysis. In this section we consider 
some questions of a somewhat special nature. 


I. Volterra’s Equation 


It is appropriate to start with a Volterra equation,
$$f(t) = g(t) + \int_0^t K(s, t) f(s)\,ds. \tag{5.7.1}$$
To simplify matters, we restrict ourselves to a finite interval and continuous
functions. Let $g \in C[0, b]$ and let K(s, t) be continuous for (s, t) in $[0, b] \times [0, b]$,
where $\|K\| = B$. Then for $f \in C[0, b]$
$$S[f](t) = \int_0^t K(s, t) f(s)\,ds \tag{5.7.2}$$
belongs to C[0, b] and the norm of the transformation S is at most bB. To be able
to apply Theorem 5.6.1 we must show that
$$\sum_{n=1}^{\infty} \|S^n\| \tag{5.7.3}$$
converges. Now
$$S^n[f](t) = \int_0^t K_n(s, t) f(s)\,ds, \tag{5.7.4}$$
where $K_n$ is the nth iterated kernel. Compare Problem 4, Exercise 5.6,
where, however, the interval of integration is fixed and the kernel is periodic. Here
$$S^2[f](t) = \int_0^t K(s, t) \Big[\int_0^s K(u, s) f(u)\,du\Big] ds = \int_0^t f(u) \Big[\int_u^t K(s, t) K(u, s)\,ds\Big] du,$$


where the interchange of the order of integration is permitted since the functions 
are continuous. This gives
$$K_2(s, t) = \int_s^t K(s, u) K(u, t)\,du, \tag{5.7.5}$$
as is seen by permuting the variables in the preceding integral. In general,
$$K_n(s, t) = \int_s^t K(s, u) K_{n-1}(u, t)\,du. \tag{5.7.6}$$
From (5.7.5) we obtain
$$\|S^2\| \le \tfrac{1}{2}(bB)^2$$
and by complete induction
$$\|S^n\| \le \frac{1}{n!}(bB)^n. \tag{5.7.7}$$
Thus condition (5.7.3) is amply satisfied. Since the other conditions of
Theorem 5.6.1 obviously hold, it follows that the transformation $T[f] = g + S[f]$
has a unique fixed point or, equivalently, the Volterra equation (5.7.1) has a unique
solution. The solution is given by the Neumann series, which now becomes
$$f(t) = g(t) + \int_0^t \mathfrak{K}(s, t)\, g(s)\,ds, \tag{5.7.8}$$
where the so-called resolvent kernel is
$$\mathfrak{K}(s, t) = \sum_{n=1}^{\infty} K_n(s, t) \tag{5.7.9}$$
with $K_1(s, t) = K(s, t)$.
The assumptions on f, g, and K may be relaxed. Thus we may replace C[0, b]
by $L_2(0, b)$ and appeal to the Fubini theorem.
In the special case
$$K(s, t) = K(s) \in L(0, b) \cap C(0, b] \tag{5.7.10}$$
the solution can be given explicitly. Here
$$K_2(s, t) = \int_s^t K(s) K(u)\,du = K(s) \int_s^t K(u)\,du,$$
$$K_3(s, t) = K(s) \int_s^t K(u) \Big[\int_u^t K(v)\,dv\Big] du = K(s)\, \frac{1}{2} \Big[\int_s^t K(u)\,du\Big]^2.$$
It follows that the resolvent kernel is given by
$$\mathfrak{K}(s, t) = K(s) \exp\Big[\int_s^t K(u)\,du\Big] \tag{5.7.11}$$
and
$$f(t) = g(t) + \int_0^t K(s) \exp\Big[\int_s^t K(u)\,du\Big] g(s)\,ds. \tag{5.7.12}$$
The special case $K(s) \equiv 1$ figured in (5.6.9): we note that $g(t) \equiv 0$ implies
$f(t) \equiv 0$. It should be observed that the spectrum of this operator as well as the
general Volterra operator reduces to {0}. Volterra developed an extensive theory of 
functions of composition in connection with his integral equation. 
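
The explicit formula (5.7.12) lends itself to a numerical check. In the Python sketch below, which is an added illustration and not a definitive implementation, the Volterra equation (5.7.1) is solved by successive approximations with the trapezoidal rule for the hypothetical choice $K(s, t) = s$, $g(t) \equiv 1$, $b = 1$. For this choice (5.7.12) gives $f(t) = 1 + \int_0^t s \exp[\tfrac{1}{2}(t^2 - s^2)]\,ds = \exp(\tfrac{1}{2}t^2)$. The function name `solve_volterra` and the grid parameters are assumptions.

```python
import math

def solve_volterra(kernel, g, b, n_grid=100, n_iter=25):
    """Approximate the solution of f(t) = g(t) + int_0^t kernel(s, t) f(s) ds on [0, b]
    by successive approximations, using the trapezoidal rule on a uniform grid."""
    h = b / n_grid
    ts = [i * h for i in range(n_grid + 1)]
    f = [g(t) for t in ts]
    for _ in range(n_iter):
        new_f = []
        for i, t in enumerate(ts):
            total = 0.0
            if i > 0:  # the integral over [0, 0] vanishes
                for j in range(i + 1):
                    w = 0.5 if j in (0, i) else 1.0   # trapezoidal weights
                    total += w * kernel(ts[j], t) * f[j]
            new_f.append(g(t) + h * total)
        f = new_f
    return ts, f

# Hypothetical data: K(s, t) = K(s) = s and g = 1, for which (5.7.12) yields exp(t^2 / 2).
ts, f = solve_volterra(lambda s, t: s, lambda t: 1.0, 1.0)
max_err = max(abs(fi - math.exp(0.5 * ti * ti)) for ti, fi in zip(ts, f))
print(max_err)
```

Because of the factorial decay (5.7.7) the number of iterations needed is small, and the remaining discrepancy is the quadrature error of the grid.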


II. Fredholm’s Equation 


A particular case of this equation was studied in Exercise 5.6 where, thanks to the
special nature of the kernel, the solution as a function of the parameter $\mu$ was
obtained for all non-singular values of $\mu$. The resulting series in Problem 5 is of a
type familiar in analytic function theory as the Mittag-Leffler expansion of the
meromorphic function $f(t; \mu) - g(t)$. In the case to be studied here, the solution is
still a meromorphic function of $\mu$, as shown by Fredholm, but all we obtain in general
is a power series for the function for small values of $\mu$.
We shall consider 


$$f(t) = g(t) + \mu \int_a^b K(s, t) f(s)\,ds. \tag{5.7.13}$$
Here $g \in C[a, b]$ and K is continuous for (s, t) in the square $[a, b] \times [a, b]$.
We set
$$S[f](t) = \mu \int_a^b K(s, t) f(s)\,ds. \tag{5.7.14}$$
This is a linear bounded transformation from C[a, b] into itself and
$$\|S\| \le |\mu|(b - a)B, \quad \text{if } B = \max |K(s, t)|. \tag{5.7.15}$$
Thus the series $\sum \|S^n\|$ is certainly convergent for
$$|\mu| < [(b - a)B]^{-1}. \tag{5.7.16}$$
Hence the solution is of the form
$$f(t) = g(t) + \int_a^b \mathfrak{K}(s, t; \mu)\, g(s)\,ds \tag{5.7.17}$$
with the resolvent kernel
$$\mathfrak{K}(s, t; \mu) = \sum_{n=1}^{\infty} K_n(s, t)\, \mu^n. \tag{5.7.18}$$
n=] 
The series converges at least for $\mu$ satisfying (5.7.16) and again $K_n(s, t)$ is an
iterated kernel
$$K_n(s, t) = \int_a^b K(s, u) K_{n-1}(u, t)\,du, \qquad n = 2, 3, \ldots. \tag{5.7.19}$$
For fixed (s, t) the resolvent kernel is an analytic function of $\mu$. It turns out
that, just as in the special case considered in Exercise 5.6, the singularities of this
function are fixed, independent of (s, t), and those in the finite plane, if any, are
poles. The singularities are the reciprocals of the values of $\lambda$ for which the
homogeneous equation
$$\lambda h(t) = \int_a^b K(s, t) h(s)\,ds \tag{5.7.20}$$
has a non-trivial solution in C[a, b]. For such a value, $\lambda_0$ say, the non-homogeneous
equation
$$\lambda_0 f(t) = g(t) + \int_a^b K(s, t) f(s)\,ds \tag{5.7.21}$$
does not possess a solution for an arbitrary choice of $g \in C[a, b]$. The situation is
analogous to that holding for systems of linear equations in C”. Fredholm 
suspected this analogy to be valid and it served him as a guiding line in his work. 

The restriction to continuous functions is by no means necessary. If it is 
dropped, then we can consider Volterra’s equation as a special case of Fredholm’s, 
where the kernel is zero above the diagonal t = s and the upper limit of integration
is taken as 1. 
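
The role of the parameter can be seen in a small numerical experiment. The Python sketch below is an added illustration under stated assumptions: it applies the Neumann iteration to (5.7.13) on $[0, 1]$ with the hypothetical kernel $K(s, t) = st$ and $g \equiv 1$, using the trapezoidal rule. For this kernel the equation can be solved by hand, $f(t) = 1 + \mu c\,t$ with $c = \tfrac{1}{2}(1 - \mu/3)^{-1}$, so that the solution has a single pole at $\mu = 3$. The function name `solve_fredholm` and the grid parameters are assumptions.

```python
def solve_fredholm(kernel, g, mu, a, b, n_grid=100, n_iter=80):
    """Neumann iteration f <- g + mu * int_a^b kernel(s, t) f(s) ds on a uniform grid."""
    h = (b - a) / n_grid
    ts = [a + i * h for i in range(n_grid + 1)]
    w = [h] * (n_grid + 1)          # trapezoidal weights
    w[0] = w[-1] = 0.5 * h
    gs = [g(t) for t in ts]
    # K[i][j] = kernel(s_j, t_i): first argument is the integration variable
    K = [[kernel(s, t) for s in ts] for t in ts]
    f = gs[:]
    for _ in range(n_iter):
        f = [gs[i] + mu * sum(w[j] * K[i][j] * f[j] for j in range(n_grid + 1))
             for i in range(n_grid + 1)]
    return ts, f

# Hypothetical data: K(s, t) = s * t, g = 1, mu = 2; the exact solution is 1 + 3 t.
mu = 2.0
ts, f = solve_fredholm(lambda s, t: s * t, lambda t: 1.0, mu, 0.0, 1.0)
max_err = max(abs(fi - (1.0 + 3.0 * ti)) for ti, fi in zip(ts, f))
print(max_err)
```

Here B = 1 and b - a = 1, so (5.7.16) only guarantees convergence for $|\mu| < 1$. The iteration nevertheless converges for $\mu = 2$, since the only non-zero characteristic value of this kernel is 1/3, which illustrates that (5.7.16) is a sufficient condition and not a sharp one.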

To illustrate the theory, let us consider two special examples where explicit
solutions are available. Let $\{\omega_n(t)\}$ be a complete orthonormal system for $L_2(a, b)$.
To simplify matters, we assume $\omega_n(t) \in C[a, b]$ for all n and to be real and of
uniformly bounded sup norm. Such systems exist. Form the kernel
$$K(s, t) = \sum_{m=1}^{\infty} k_m \omega_m(s) \omega_m(t), \tag{5.7.22}$$
where $k_m \ge 0$, $\forall m$, and the series
$$\sum_{m=1}^{\infty} k_m \tag{5.7.23}$$
is convergent. These assumptions imply that (5.7.22) is absolutely convergent,
uniformly in (s, t), and its sum is a continuous function of (s, t). We can
now compute the iterated kernels and find that
$$K_n(s, t) = \sum_{m=1}^{\infty} (k_m)^n \omega_m(s) \omega_m(t) \tag{5.7.24}$$
and the resolvent kernel becomes
$$\mathfrak{K}(s, t; \mu) = \sum_{n=1}^{\infty} \mu^n \sum_{m=1}^{\infty} (k_m)^n \omega_m(s) \omega_m(t). \tag{5.7.25}$$
If now
$$|\mu| < \inf_m (|k_m|^{-1}), \tag{5.7.26}$$
then the double series is absolutely convergent, uniformly in (s, t), and the order of
summation may be interchanged. This gives, after simplification,
$$\mathfrak{K}(s, t; \mu) = \sum_{m=1}^{\infty} \frac{\mu k_m}{1 - \mu k_m}\, \omega_m(s) \omega_m(t). \tag{5.7.27}$$
Denote the set
$$\{(k_m)^{-1};\ k_m \ne 0,\ 1 \le m < \infty\}$$
by $\Sigma$ and suppose that $\mu$ has a positive distance from $\Sigma$. Then the series (5.7.27)
is absolutely convergent, uniformly in (s, t). For fixed (s, t) the resolvent kernel is a
meromorphic function of $\mu$ with simple poles at the points of $\Sigma$. Note that $\Sigma$ has
no finite cluster points. We have obviously
$$k_m \omega_m(t) = \int_a^b K(s, t)\, \omega_m(s)\,ds, \tag{5.7.28}$$
that is, the $k_m$ are all in the point spectrum (= set of characteristic values) of the
corresponding operator. Cf. (5.7.20).

Note that the partial fraction series (5.7.27) is valid for all $\mu$ not in $\Sigma$, while the
power series (5.7.25) presupposes that (5.7.26) holds, and in this circular disk the
expansions represent the same analytic function of $\mu$.

We can now form the solution (5.7.17). Here g shall belong to $L_2(a, b)$ and
have Fourier coefficients $\{g_m\}$ with respect to the system $\{\omega_m\}$. Substitution of
(5.7.27) and termwise integration finally yield
$$f(t; \mu) = g(t) + \mu \sum_{m=1}^{\infty} \frac{g_m k_m}{1 - \mu k_m}\, \omega_m(t), \tag{5.7.29}$$
where the series is absolutely convergent for all $\mu$ not in $\Sigma$ and the convergence is
uniform in t. Thus the non-homogeneous equation is solvable for every $\mu$ not in $\Sigma$
and the solution is given by (5.7.29). On the other hand, for $\mu = (k_m)^{-1}$ the non-homogeneous
equation has a solution in $L_2$ iff the corresponding Fourier coefficient
$g_m$ of g(t) is zero so that (5.7.29) remains valid after the indeterminate form has
been suppressed.

The kernel (5.7.22) is symmetric, i.e. K(s, t) = K(t, s). Such kernels were
studied in great detail by Hilbert and his school during the first decade of this
century after Fredholm's results became known. Erhard Schmidt, one of Hilbert's
pupils, proved that such a kernel necessarily has at least one characteristic value.
The minimum number one is reached by the kernel
$$k_m \omega_m(s) \omega_m(t).$$
Let us consider briefly the non-symmetric kernel
$$K(s, t) = \sum_{m=1}^{\infty} k_m \omega_m(s) \omega_{m+1}(t) \tag{5.7.30}$$
under the same assumptions on $k_m$ and $\omega_m$. Here the situation is radically different.
There are no characteristic values. The reader should note that the term "characteristic
value" means one thing in the theory of integral equations and something
different in linear operator theory. The relation is simply $\lambda = 1/\mu$. Cf. comments
to equations (5.7.20) and (5.7.21).

We now obtain
$$K_n(s, t) = \sum_{m=1}^{\infty} k_m k_{m+1} \cdots k_{m+n-1}\, \omega_m(s) \omega_{m+n}(t) \tag{5.7.31}$$
and
$$\mathfrak{K}(s, t; \mu) = \sum_{n=1}^{\infty} \mu^n \sum_{m=1}^{\infty} k_m k_{m+1} \cdots k_{m+n-1}\, \omega_m(s) \omega_{m+n}(t). \tag{5.7.32}$$
Here the double series is absolutely convergent, uniformly in (s, t), for any finite $\mu$
so that $\mathfrak{K}(s, t; \mu)$ is an entire function of $\mu$. We shall not prove this and note merely
the solution of the non-homogeneous equation, valid for all $\mu$,
$$f(t; \mu) = g(t) + \sum_{n=1}^{\infty} \mu^n \sum_{m=1}^{\infty} k_m k_{m+1} \cdots k_{m+n-1}\, g_m\, \omega_{m+n}(t), \tag{5.7.33}$$
where, as above, the $g_m$'s are the Fourier coefficients of g in the system $\{\omega_m\}$.

Integral equations are of great importance in applied mathematics (potential 
theory, boundary value problems for differential equations which usually lead to 
kernels of type (5.7.22), etc.) and this theory was one of the forerunners of functional 
analysis. 


III. Infinitely Many Equations in Infinitely Many Unknowns


The problem of solving an infinite system of linear equations presented itself quite 
early in classical analysis and in many different connections. An early instance 
occurs in the work of J. Fourier on heat conduction. The American astronomer 
G. W. Hill (1838-1914) encountered such systems in lunar theory in 1877 and 
solved these systems formally by infinite determinants. Henri Poincaré (1854— 
1912) in his revision of celestial mechanics, 1892-9, put this work on a firm basis 
and Helge von Koch (1870-1924) developed an elaborate theory of infinite 
determinants with applications to linear differential equations and other fields, 
starting in 1892. His work was basic for Fredholm’s investigations. Quadratic 
forms in infinitely many unknowns and the related linear equations formed a 
fundamental part in Hilbert’s work on integral equations. Here we shall take up a 
few cases of linear systems which may be studied with the aid of the tools developed 
in this chapter.

We consider first the class $\mathfrak{M}_\infty$ of infinite matrices $\mathcal{A} = (a_{jk})$ with the norm
$$\|\mathcal{A}\|_\infty = \sup_j \sum_{k=1}^{\infty} |a_{jk}| < \infty, \tag{5.7.34}$$


which is a natural generalization of (1.4.43). We define the algebraic matrix 
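operations in the natural manner. The product is the only operation which needs
justification. We have $\mathcal{A}\mathcal{B} = (c_{jk})$ with
$$c_{jk} = \sum_{m=1}^{\infty} a_{jm} b_{mk}.$$
Hence
$$\sum_{k=1}^{\infty} |c_{jk}| \le \sum_{k=1}^{\infty} \sum_{m=1}^{\infty} |a_{jm}|\,|b_{mk}| = \sum_{m=1}^{\infty} |a_{jm}| \sum_{k=1}^{\infty} |b_{mk}| \le \|\mathcal{A}\|_\infty \|\mathcal{B}\|_\infty.$$
Thus $\mathcal{A}\mathcal{B}$ exists and is an element of $\mathfrak{M}_\infty$; moreover,
$$\|\mathcal{A}\mathcal{B}\|_\infty \le \|\mathcal{A}\|_\infty \|\mathcal{B}\|_\infty. \tag{5.7.35}$$
The reader should verify that $\mathfrak{M}_\infty$ is complete in the metric so that it is a B-algebra.
The unit element is $\mathcal{E} = (\delta_{jk})$.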
We now consider the equation 


$$\lambda y - \mathcal{A} y = x, \tag{5.7.36}$$
where $x = (x_j) \in l_\infty$; that is,
$$\|x\|_\infty = \sup_j |x_j| < \infty. \tag{5.7.37}$$
A solution is desired in the same space. Here $\lambda$ is a complex parameter and
formally the solution is given by the resolvent of $\mathcal{A}$ operating on x or
$$y = R(\lambda, \mathcal{A})x. \tag{5.7.38}$$
This is true whenever the resolvent makes sense, in particular for
$$|\lambda| > \|\mathcal{A}\|_\infty. \tag{5.7.39}$$


For such values the usual Neumann series will give the solution. 
The special case
$$a_{jk} = \begin{cases} 1/j, & k \le j, \\ 0, & j < k, \end{cases} \tag{5.7.40}$$
is of some interest. It is the matrix $\mathcal{C}$ corresponding to Cesàro summability of
order one, usually denoted by (C, 1). Cf. Problem 14, Exercise 3.1. Here
$\|\mathcal{C}\|_\infty = 1$ so the resolvent series converges for $|\lambda| > 1$. It is possible to find the
characteristic values and corresponding vectors explicitly. They are


Ae a Vi (0). Wed, 24.3, cx; (5.7.41) 


respectively. The proof is left to the reader.
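
The Neumann series for the Cesàro matrix can be tried out on a finite section. The following Python sketch is an added illustration under stated assumptions. Since the matrix (5.7.40) is lower triangular, the first n components of the solution of (5.7.36) depend only on the upper left $n \times n$ block, so nothing is lost by truncating. The helper names, the size n = 6, the value $\lambda = 2$ and the right-hand side are hypothetical choices.

```python
def cesaro_matrix(n):
    """Upper-left n x n block of the (C, 1) matrix: a_jk = 1/j for k <= j, else 0."""
    return [[1.0 / j if k <= j else 0.0 for k in range(1, n + 1)]
            for j in range(1, n + 1)]

def mat_vec(A, x):
    """Multiply the square matrix A (a list of rows) by the vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def neumann_solve(A, x, lam, n_terms=80):
    """Partial sum of y = sum_{n >= 0} A^n x / lam^(n + 1), the Neumann series for R(lam, A) x."""
    term = [v / lam for v in x]
    y = term[:]
    for _ in range(n_terms):
        term = [v / lam for v in mat_vec(A, term)]
        y = [u + v for u, v in zip(y, term)]
    return y

C = cesaro_matrix(6)
x = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
lam = 2.0                      # any |lam| > ||C|| = 1 is covered by (5.7.39)
y = neumann_solve(C, x, lam)

# Residual of lam * y - C y = x, equation (5.7.36)
residual = [lam * yj - cj - xj for yj, cj, xj in zip(y, mat_vec(C, y), x)]
print(y)
print(max(abs(r) for r in residual))
```

The terms of the series are bounded by $|\lambda|^{-n-1}$ times the norm of x, so with $\lambda = 2$ eighty terms leave a negligible remainder.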

Another case in which the Neumann series applies is given by the matrix (a;,) 
with 
Aj, = Ο;εκ- τ" (5.7.42) 


J 
Here $c_n > 0$ and strictly decreasing to 0 as $n \to \infty$. Further,
> Ge. (5.7.43) 
n=1 


Here $\|\mathcal{A}\|_\infty = 1$. Nevertheless, the solution of
$$y - \mathcal{A}y = x \tag{5.7.44}$$
is given by the absolutely convergent series
$$y = x + \mathcal{A}x + \mathcal{A}^2 x + \cdots + \mathcal{A}^n x + \cdots. \tag{5.7.45}$$
This follows from the fact that $\|\mathcal{A}^n\|$ actually goes exponentially to zero. We have


|| A? || = ΟΣ DO AC narra <— 1 ca Cy + ὃν = «- Fe 
ij m 


Complete induction gives 
| A"|] < ren), (5.7.46) 


where [uw] is the largest integer <u. This proves the assertion on the norm of A" 
and also the validity of (5.7.45) as the solution of (5.7.44). 


EXERCISE 5.7 


1. Find the resolvent kernel $\mathfrak{K}(s, t)$ of the Volterra equation with kernel $(t - s)^\alpha$,
$\alpha > -1$. [Hint: Remember the $\beta$-function of Euler!]


2. Find the resolvent kernel $\mathfrak{K}(s, t; \mu)$ of the Fredholm equation with a = 0, b = 1,
K(s, t) = t - s. The power series can be summed in closed form. Show that it is a
meromorphic function of $\mu$ and find the poles.


3. Verify (5.7.24) to (5.7.27). 
4. Verify (5.7.28) and justify (5.7.29). 


5. Are there complete orthonormal systems in $L_2(a, b)$ the elements of which are in
C[a, b] and of uniformly bounded sup norms?


6. Verify (5.7.31) and prove the convergence of the series. 


7. Prove the absolute and uniform convergence of the series (5.7.32), assuming (5.7.23) 
absolutely convergent. 


8. If K(s, t) is defined by (5.7.30) with (5.7.23) holding and if
$$S[f](t) = \int_a^b K(s, t) f(s)\,ds, \qquad f \in C[a, b],$$
what is the spectrum of the operator S? [Hint: Prove quasi-nilpotency.]
9. Justify (5.7.33). 


10. If $k_m = m^{-2}$ in (5.7.32), show that $\sup_{(s,t)} |\mathfrak{K}(s, t; \mu)| \le A \exp(B|\mu|^{1/2})$ for all $\mu$ and
some constants A and B.


11. Suppose, instead, that $k_m = m^{-\alpha}$, $\alpha > 1$; show that the upper bound should be of
the form $A \exp(B|\mu|^{1/\alpha})$. Thus the resolvent kernel is an entire function of order at
most $1/\alpha$. For $\tfrac{1}{2} < \alpha \le 1$, the kernel K(s, t) is still $L_2$ and all iterated kernels are in
C[a, b]. The estimate now applies to $\mathfrak{K}(s, t; \mu) - \mu K(s, t)$. Prove this.


12. Prove that $\mathfrak{M}_\infty$ is complete in the normed metric.


13. The matrix $\mathcal{C}$ of (5.7.40) has an inverse not in $\mathfrak{M}_\infty$. Find it!


14. Show that the spectrum and characteristic values of $\mathcal{C}$ are as stated in (5.7.41).
Describe the matrix $R(\lambda, \mathcal{C})$.


15. Verify (5.7.46).


16. In 1832 the German astronomer and mathematician August Ferdinand Möbius
(1790-1868) discovered an inversion formula of importance in analysis and number
theory. It can be formulated in terms of two infinite matrices which are inverses of
each other. Here $\mathcal{A} = (a_{jk})$, where $a_{jk} = 1$ or 0 according as j divides k or not. The
elements $b_{jk}$ of $\mathcal{B}$ assume the values 0, +1, -1. They are expressible in terms of the
Möbius constants $\mu(n)$. The latter is 0 if n is divisible by a square, $\mu(1) = 1$, and
$\mu(n) = (-1)^k$ if n is the product of k distinct primes. Show that $b_{jk} = 0$ unless
k = jn, in which case $b_{jk} = \mu(n)$. [Note that this is merely a convenient way of
formulating the facts. Möbius wrote before the days of matrices.]
6 EXISTENCE AND UNIQUENESS THEOREMS


Analysts are sceptical by instinct, training, and habitude. One way in which this 
attitude shows itself is through the emphasis placed on existence and uniqueness 
theorems in analysis. Where the physicist places his trust in intuition and in the 
intrinsic simplicity of natural phenomena to help him over the mathematical 
difficulties adherent to his problems, the mathematician holds back until he has 
confirmed his guess by a proof. There is something to be said for both attitudes. 
Mathematicians are seldom prepared at the outset to render effective help to the 
physicist, who is thus left to do the pioneering. On the other hand, once the 
mathematician has become interested in the ideas involved, he can often go further 
than the physicist and investigate aspects of the problem where the physicist’s 
intuition is of little help. What will be developed in this chapter should be viewed 
against the background of alternating advances in mathematics and physics during 
the last 300 years. Most of what we are going to do are applications either of fixed 
point theorems or of the older method of successive approximations. 

There are four sections: The implicit function theorem; The method of 
successive approximations; Majorants; and Applications to ordinary differential 
equations. 


6.1 THE IMPLICIT FUNCTION THEOREM 


In analysis functions are often defined implicitly, and this leads to two basic prob- 
lems: (1) Is there a function that satisfies the conditions of the problem and, if so, is 
it unique? (2) Find a process that leads to a construction of the solution. Nowadays 
the second problem will presumably include a request for effective programming of 
the computer. 

We shall start with the simplest case. There is given a real-valued function of 
two real variables F(x, y) and a point $(x_0, y_0)$, where
$$F(x_0, y_0) = 0. \tag{6.1.1}$$
Is it possible to solve the equation
$$F(x, y) = 0 \tag{6.1.2}$$
for y in terms of x, say y = f(x), in some neighborhood of $x = x_0$, where
$f(x_0) = y_0$? If it is possible, is the solution unique? The geometrical interpretation
of the problem is simple. We are given a curve C in the (x, y)-plane of equation
(6.1.2) and a point $(x_0, y_0)$ on the curve. The example
$$(x^2 + y^2)^2 - x^2 - a y^2 = 0, \qquad (x_0, y_0) = (0, 0),$$
shows that such a point may be an isolated point (for a > 0) or a double point with
two real branches (a < 0). What we need is a condition that excludes multiple
points as well as branches with a vertical tangent where no local representation of
the form y = f(x) is possible. Such conditions can be found and we shall show how.

We place $(x_0, y_0)$ at the origin and assume that F(x, y) is continuous and has
continuous first order partials in some neighborhood of the origin. We express this 
simply by saying that F is continuously differentiable in the neighborhood in question. 


Theorem 6.1.1. Let F(x, y) be defined and continuously differentiable in some
domain D of $R^2$ containing the origin. Suppose that (i) $F(0, 0) = 0$
and (ii) $F_y(0, 0) \ne 0$. Then there exists an interval (-r, r) and a function
f(x) such that (1) f(x) is continuous and differentiable in (-r, r), (2) f(0) = 0,
(3)
$$F[x, f(x)] = 0, \qquad -r < x < r, \tag{6.1.3}$$
and (4)
$$f'(x) = -\frac{F_x[x, f(x)]}{F_y[x, f(x)]}. \tag{6.1.4}$$

Proof. We can give an elementary proof for the stated facts. Its only blemish is that
it does not readily extend to more complicated situations. But we are concerned here
with methods and one more will not be fatal. The idea of the proof is that in some
square centered at the origin with sides parallel to the axes the partial $F_y(x, y)$
keeps a constant sign so that F(x, y) is a monotone function of y for fixed x. If the
side of the square is small, F(x, y) will be of constant sign on each of the horizontal
boundaries, the sign on one being the opposite of that on the other. This means
that on each vertical line segment of the square F(x, y) changes its sign once and
only once. The value of y where this change takes place is the desired solution.
The details follow.

To fix the ideas, suppose that $F_y(0, 0) > 0$. Then by the continuity of the
partials $F_y(x, y) > 0$ in some neighborhood of the origin contained in D. In this
neighborhood we can find a square, $|x| \le a$, $|y| \le a$, where $F_y(x, y)$ has a positive
lower bound, say
$$F_y(x, y) \ge B > 0. \tag{6.1.5}$$
In the square $F_x(x, y)$ exists and is bounded, say,
$$|F_x(x, y)| \le A. \tag{6.1.6}$$
We now define
$$r = \min\Big\{a, \frac{B}{A}\,a\Big\} \tag{6.1.7}$$
and consider the rectangle R: $|x| < r$, $|y| \le a$. Since F(x, y) is monotone on that
part of the y-axis which lies in R, we have $F(0, a) > 0$, $F(0, -a) < 0$. More
precisely, by the mean value theorem
$$F(0, a) = a F_y(0, t) \ge aB, \quad 0 < t < a, \qquad F(0, -a) \le -aB.$$
Now consider, for $0 < x < r$,
$$F(x, a) = F(0, a) + x F_x(s, a) > aB - rA \ge 0$$
by (6.1.7), where $0 < s < x$. Similarly, for $0 < x < r$,
$$F(x, -a) = F(0, -a) + x F_x(s_1, -a) < -aB + rA \le 0,$$
where $0 < s_1 < x$. These inequalities are also true for $-r < x < 0$.

Thus F(x, y) is positive on the upper edge of the rectangle R, negative on the
lower edge. Further, for fixed s, $-r < s < r$, F(s, y) is an increasing function of y
for $-a \le y \le a$. Thus there is one and only one value y = t where
$$F(s, t) = 0.$$
This defines t uniquely as a function of s, t = f(s). We have
$$f(0) = 0, \qquad F[s, f(s)] = 0, \qquad -r < s < r.$$


It is fairly obvious that f(x) is continuous. The proof goes as follows and will
give differentiability as a byproduct. Suppose that $-r < x_1 < x_2 < r$ and set
$y_1 = f(x_1)$, $y_2 = f(x_2)$. Then by the mean value theorem
$$0 = F(x_1, y_1) - F(x_2, y_2) = F_x(x_3, y_3)(x_1 - x_2) + F_y(x_3, y_3)(y_1 - y_2),$$
where $(x_3, y_3)$ is a point on the line segment joining $(x_1, y_1)$ with $(x_2, y_2)$. Hence
$$y_2 - y_1 = -\frac{F_x(x_3, y_3)}{F_y(x_3, y_3)}\,(x_2 - x_1) \tag{6.1.8}$$
and
$$|f(x_2) - f(x_1)| \le \frac{A}{B}\,|x_2 - x_1|. \tag{6.1.9}$$
This Lipschitz condition implies continuity. But (6.1.8) also gives
$$\lim_{x_2 \to x_1} \frac{f(x_2) - f(x_1)}{x_2 - x_1} = -\lim_{x_2 \to x_1} \frac{F_x(x_3, y_3)}{F_y(x_3, y_3)}.$$
Here $x_2 \to x_1$ implies $x_3 \to x_1$, $y_3 \to y_1 = f(x_1)$ and (6.1.4) results. ∎
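
The proof is constructive in spirit: for fixed x the function F(x, y) is increasing in y and changes sign on $[-a, a]$, so the root can be located by bisection. The Python sketch below is an added illustration under stated assumptions, using the hypothetical function $F(x, y) = y^3 + y - x$, for which $F_y = 3y^2 + 1 > 0$ everywhere. The names `implicit_solution` and `f`, the half-width a = 2 and the tolerance are choices made for the example.

```python
def implicit_solution(F, x, a, tol=1e-12):
    """Solve F(x, y) = 0 for y in [-a, a] by bisection, assuming F(x, -a) < 0 < F(x, a)
    and F increasing in y, as in the proof of Theorem 6.1.1."""
    lo, hi = -a, a
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(x, mid) > 0:
            hi = mid          # the sign change lies below mid
        else:
            lo = mid          # the sign change lies at or above mid
    return 0.5 * (lo + hi)

def F(x, y):
    # Hypothetical example with F(0, 0) = 0 and F_y = 3 y^2 + 1 > 0
    return y**3 + y - x

def f(x):
    return implicit_solution(F, x, a=2.0)

h = 1e-4
slope = (f(2.0 + h) - f(2.0 - h)) / (2.0 * h)   # difference quotient for f'(2)
print(f(0.0), f(2.0), slope)
```

For this F one has f(2) = 1, and formula (6.1.4) gives $f'(2) = -F_x/F_y = 1/(3 \cdot 1^2 + 1) = 1/4$, which the difference quotient reproduces to within the accuracy of the step h.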


Various other methods are available to prove such facts. We shall consider the 
contraction fixed point theorem but apply it to a much more general situation, 
namely the existence of implicit functions in a Banach algebra $\mathfrak{B}$ with unit element
e. Commutativity of multiplication is not assumed. Elements are denoted by 
italics to avoid uglification. 
Theorem 6.1.2. Let G(x, y) be a function from $\mathfrak{B} \times \mathfrak{B}$ to $\mathfrak{B}$ defined on a domain
$\mathfrak{D}$. Here $\mathfrak{D}$ is taken as the "di-sphere": $\|x\| < \alpha$, $\|y\| < \beta$, G(0, 0) = 0, and
G shall satisfy a Lipschitz condition
$$\|G(x_1, y_1) - G(x_2, y_2)\| \le k[\|x_1 - x_2\| + \|y_1 - y_2\|], \tag{6.1.10}$$
where k is fixed, $0 < k < 1$. Then the equation
$$y = ax + G(x, y), \qquad a \in \mathfrak{B}, \tag{6.1.11}$$
has a unique solution, y = f(x), in $\mathfrak{B}$, with f(0) = 0, defined for
$$\|x\| < \rho \le \min\Big(\alpha, \frac{(1 - k)\beta}{\|a\| + k}\Big). \tag{6.1.12}$$
The solution satisfies a Lipschitz condition
$$\|f(x_1) - f(x_2)\| \le \frac{\|a\| + k}{1 - k}\,\|x_1 - x_2\|. \tag{6.1.13}$$

Proof. We shall use the fixed point method. The reader will find another proof 
based on the method of successive approximations in the next section. For the first 
method we have to produce a complete metric space X and a contraction mapping 
from X into X such that the expected solution of (6.1.11) comes out as the fixed 
point of the mapping. Let $X$ be the family of functions $g(x)$ satisfying the following
conditions:

1) $g(x)$ is defined and continuous for $\|x\| \le \rho$, with $\rho$ restricted by (6.1.12), and its range
is in $\mathfrak{B}$.

2) $g(0) = 0$ and $\|g(x)\| \le \beta$ for $\|x\| \le \rho$.

This is the $\mathfrak{B}$-norm but we have also to introduce a metric in $X$. For $g, h \in X$ we set
$d(g, h) = \sup \|g(x) - h(x)\|$ where the supremum refers to $\|x\| \le \rho$. It is clear that
$X$ is a complete metric space. For if $\{g_n\}$ is a Cauchy sequence in $X$, then
$\lim_{n \to \infty} g_n(x) = g_0(x)$ exists, is continuous, vanishes at $x = 0$ and $\|g_0(x)\| \le \beta$ for
all $x$ with $\|x\| \le \rho$. Hence $g_0 \in X$ and $X$ is complete under the normed metric.
We now define T as the operator that takes g into 


\[ T[g](x) = ax + G[x, g(x)], \qquad g \in X. \tag{6.1.14} \]

We have to show that $T[g]$ exists and belongs to $X$. First $T[g](x)$ is defined and
continuous for $\|x\| \le \rho$ by virtue of the conditions imposed on $g$. Using (6.1.10)
with $x_2 = y_2 = 0$, $x_1 = x$, $y_1 = g(x)$ we get

\[ \|G[x, g(x)]\| \le k[\|x\| + \|g(x)\|] \le k(\rho + \beta) \]

so that

\[ \|T[g](x)\| \le \|a\|\,\|x\| + k(\rho + \beta) \le \|a\|\rho + k\rho + k\beta < \beta \]

by the choice of $\rho$. Hence $T[g] \in X$. Further,

\[ \|T[g_1](x) - T[g_2](x)\| = \|G[x, g_1(x)] - G[x, g_2(x)]\| \le k\|g_1(x) - g_2(x)\|. \tag{6.1.15} \]
Thus, in the metric of $X$ defined above,

\[ d[T(g_1), T(g_2)] \le k\,d[g_1, g_2], \]

so that $T$ is indeed a contraction. It follows that $T$ has a unique fixed point in $X$.
Thus there exists one and only one function $f(x) \in X$ such that

\[ f(x) = ax + G[x, f(x)], \qquad \|x\| \le \rho, \tag{6.1.16} \]

and this is the desired solution. The estimate (6.1.13) is obtained from (6.1.10) and (6.1.16)
by way of the inequality

\[ \|f(x_2) - f(x_1)\| \le \|a\|\,\|x_2 - x_1\| + k[\|x_2 - x_1\| + \|f(x_2) - f(x_1)\|], \]

which is valid for all $x_1$, $x_2$ under consideration. ∎
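
The fixed point argument can be tried out numerically in the simplest setting. The following is a minimal sketch, assuming the scalar case $\mathfrak{B} = R$ and an illustrative choice $G(x, y) = \frac{1}{4}(x^2 + y^2)$ with $a = 0.2$, $\alpha = \beta = 1$, $k = \frac{1}{2}$; none of these values come from the text, and the helper name `solve_implicit` is hypothetical. For each fixed $x$ the operator $T$ is iterated pointwise, starting from $g = 0$.

```python
def solve_implicit(a, G, x, tol=1e-14, max_iter=200):
    """Iterate y <- a*x + G(x, y) from y = 0; this is T applied pointwise."""
    y = 0.0
    for _ in range(max_iter):
        y_next = a * x + G(x, y)
        if abs(y_next - y) < tol:
            return y_next
        y = y_next
    return y


def G(x, y):
    # Illustrative choice: Lipschitz with k = 1/2 on |x| <= 1, |y| <= 1, and G(0, 0) = 0.
    return 0.25 * (x * x + y * y)


a, k, alpha, beta = 0.2, 0.5, 1.0, 1.0
rho_bound = min(alpha, (1 - k) / (abs(a) + k) * beta)   # right side of (6.1.12)
rho = 0.7                                               # any value below rho_bound
lipschitz = (abs(a) + k) / (1 - k)                      # constant in (6.1.13)

# Solve y = a*x + G(x, y) on a grid of 21 points in [-rho, rho].
xs = [-rho + 2 * rho * i / 20 for i in range(21)]
ys = [solve_implicit(a, G, x) for x in xs]
```

In this sketch the computed values stay inside the di-sphere and their difference quotients respect the constant in (6.1.13), as the theorem predicts.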


The same technique applies to other problems involving implicit functions. 
We shall give two samples, omitting proofs. 


Theorem 6.1.3. Let $a$ and $x$ be vectors in $C^n$, $a$ fixed, and let $(a, x)$ denote their
inner product. Let $G(x, y)$ be a function from $C^n \times C$ to $C$, defined for
$\|x\| < \alpha$, $|y| < \beta$ where it satisfies a Lipschitz condition

\[ |G(x_2, y_2) - G(x_1, y_1)| \le k[\|x_2 - x_1\| + |y_2 - y_1|] \tag{6.1.17} \]

with $k$ fixed, $0 < k < 1$. Further, $G(0, 0) = 0$. Then the equation

\[ y = (a, x) + G(x, y) \tag{6.1.18} \]

has a unique solution, $y = f(x)$, for

\[ \|x\| \le \rho < \min\left(\alpha, \frac{1 - k}{\|a\| + k}\,\beta\right) \tag{6.1.19} \]

such that $f(0) = 0$ and

\[ f(x) = (a, x) + G[x, f(x)]. \tag{6.1.20} \]


Next we take the case where both $x$ and $y$ are vectors belonging to the same
space $C^n$. The norms for $C^n$ and the matrix space $\mathfrak{M}_n$ are arbitrary so long as
$\|Ax\| \le \|A\|\,\|x\|$.


Theorem 6.1.4. Let $A \in \mathfrak{M}_n$ be a given constant matrix. Let $G(x, y)$ be a mapping
from $C^n \times C^n$ to $C^n$ defined for $\|x\| < \alpha$, $\|y\| < \beta$, where it satisfies the
Lipschitz condition

\[ \|G(x_2, y_2) - G(x_1, y_1)\| \le k[\|x_2 - x_1\| + \|y_2 - y_1\|] \tag{6.1.21} \]

with $0 < k < 1$ and $G(0, 0) = 0$. Then the equation

\[ y = Ax + G(x, y) \tag{6.1.22} \]

has a unique solution $y = f(x)$ defined for

\[ \|x\| \le \rho < \min\left(\alpha, \frac{1 - k}{\|A\| + k}\,\beta\right) \tag{6.1.23} \]

such that $f(0) = 0$ and

\[ f(x) = Ax + G[x, f(x)]. \tag{6.1.24} \]

The general problem of solving $F(x, y) = 0$ for $y$, where $F$ is a function from
$C^n \times C^n$ to $C^n$, can often be reduced to the preceding case. Suppose that in a
neighborhood of $(0, 0)$

\[ F(x, y) = By - Cx - H(x, y), \tag{6.1.25} \]

where $B$ and $C$ are constant matrices in $\mathfrak{M}_n$ and $H(x, y)$ is continuously
differentiable and

\[ \|H(x, y)\| = o(\|x\| + \|y\|) \tag{6.1.26} \]

as $x \to 0$, $y \to 0$. If now $B$ is a regular matrix, we can multiply $F$ on the left by $B^{-1}$
to obtain (6.1.22) with

\[ A = B^{-1}C, \qquad G = B^{-1}H. \]

In the classical theory $B$ would be the Jacobian matrix evaluated at the origin
and we would get the familiar condition of solvability at the origin in terms of the
non-vanishing of the Jacobian determinant. The extension to the case where $F$ is a
function on $C^n \times C^m$ to $C^m$ follows similar lines.


EXERCISE 6.1 


1. Find a solution $y = f(x)$ of
\[ x^3 + y^3 + x - y = 0 \]
in $R$ such that $f(0) = 0$. Find a suitable space $X$ of real-valued continuous functions
$g(x)$ and an interval $(-r, r)$ such that
\[ T: g(x) \to x + x^3 + [g(x)]^3 \]
is a contraction in $X$ if $x$ is restricted to $(-r, r)$.

2. Solve the preceding problem when $x$ and $y$ are matrices in $\mathfrak{M}_n$.

3. Prove Theorem 6.1.3.

4. Prove Theorem 6.1.4.

5. Carry through the discussion for (6.1.25) and prove that the solution has continuous
first order partials at $(0, 0)$.

6. Show that (6.1.26) holds for
\[ H(x, y) = (a, x)\,y + (a, y)\,x, \]
where $a \in C^n$ is fixed and $(a, x)$ is the inner product.

7. Discuss the existence of partial derivatives of $H(x, y)$ at $(0, 0)$ when (6.1.26) holds.

8. Does $|(x, y)|^{1/2}(x + y)$ satisfy (6.1.26)?


6.2 THE METHOD OF SUCCESSIVE APPROXIMATIONS 


Before fixed point theorems became available the method of successive approxi- 
mations was the standard tool used for the study of functional equations. This 
method was launched in 1890 by the eminent French mathematician Émile Picard
(1856-1941) who applied it to the theory of differential equations. Édouard
Goursat (1858-1936), also a prominent French mathematician, adapted the method 
to the needs of the implicit function theorem in 1903. We shall use this method in a 


Second proof of Theorem 6.1.2. There is no difference in the assumptions made. 
Typical for the method is the successive definition of the members of a sequence of 
approximations { f,(x)} where, in the present case, 


\[ f_0(x) = ax, \qquad f_n(x) = ax + G[x, f_{n-1}(x)] \tag{6.2.1} \]

for $n = 1, 2, 3, \ldots$. It is to be shown that these functions are well defined and continuous
in the normed metric for $x$ restricted by the condition (6.1.12). There is no question
concerning $f_0(x)$. Suppose that all functions $f_m(x)$ with $m \le n$ are in order. Thus
$f_n(x)$ exists, is continuous, and satisfies $\|f_n(x)\| < \beta$ for the admitted values of $x$.
We have then

\[ f_{n+1}(x) = ax + G[x, f_n(x)]. \]


Here the second term on the right exists and is a continuous function of $x$ for
$\|x\| \le \rho$. Since $G(0, 0) = 0$, the Lipschitz condition gives

\[ \|G(x, y)\| \le k[\|x\| + \|y\|] \tag{6.2.2} \]

in the domain of definition of $G$. Hence

\[ \|f_{n+1}(x)\| \le \|a\|\,\|x\| + k[\|x\| + \|f_n(x)\|] \le [\|a\| + k]\rho + k\beta < (1 - k)\beta + k\beta = \beta \]

by the choice of $\rho$. Hence $f_{n+1}(x)$ is also in order.
To prove convergence of the sequence { f,(x)} we again use the Lipschitz 
condition. We have now 


\[ \|f_{n+1}(x) - f_n(x)\| = \|G[x, f_n(x)] - G[x, f_{n-1}(x)]\| \le k\|f_n(x) - f_{n-1}(x)\| \le k^n \|f_1(x) - f_0(x)\|. \tag{6.2.3} \]

It follows that

\[ \sum_{n=0}^{\infty} \|f_{n+1}(x) - f_n(x)\| \]

converges uniformly for $\|x\| \le \rho$. Since

\[ \|f_n(x) - f_m(x)\| \le \sum_{j=m}^{n-1} \|f_{j+1}(x) - f_j(x)\| \to 0 \quad \text{as } m, n \to \infty, \]

it follows that $\{f_n(x)\}$ is a Cauchy sequence. Hence

\[ \lim_{n \to \infty} f_n(x) = f(x) \tag{6.2.4} \]

exists uniformly for $\|x\| \le \rho$. Since $f_n(0) = 0$, $\forall n$, we have also $f(0) = 0$. Further,

\[ f(x) = \lim f_{n+1}(x) = ax + \lim G[x, f_n(x)] = ax + G[x, \lim f_n(x)] = ax + G[x, f(x)] \]

by the continuity of $G$. Thus $f(x)$ satisfies (6.1.16) for $\|x\| \le \rho$.
To prove uniqueness we again appeal to the Lipschitz condition. Let $\{f_n(x)\}$
be the sequence defined above and suppose that $g(x)$ is such that $g(0) = 0$ and

\[ g(x) = ax + G[x, g(x)] \]

for $\|x\| \le \rho_0$. Then for $\|x\| \le \rho_1 = \min(\rho, \rho_0)$ the Lipschitz condition gives

\[ \|g(x) - f_{n+1}(x)\| = \|G[x, g(x)] - G[x, f_n(x)]\| \le k\|g(x) - f_n(x)\|. \]

Hence, by induction,

\[ \|g(x) - f_{n+1}(x)\| \le k^{n+1}\|g(x) - ax\|, \tag{6.2.5} \]

and this goes to zero as $n \to \infty$. Hence

\[ g(x) = \lim f_n(x) = f(x), \]

and the solution is unique for $\|x\| \le \rho_1$. ∎
We can use Problem 1 of the preceding exercise as an illustration. 


Example 1. Find the solution $y = f(x)$ of

\[ y = x + x^3 + y^3, \qquad f(0) = 0, \tag{6.2.6} \]

in an interval containing $x = 0$. From the point of view of analytic geometry this is
a cubic passing through the origin and we want the ordinate expressed in terms of
the abscissa on the largest possible arc containing the point $(0, 0)$. A sketch of the
graph will show that the curve tangent is vertical at the endpoints of the maximal
arc. The analytical discussion will confirm this.

We have here 


\[ f_0(x) = x, \qquad f_1(x) = x + 2x^3, \qquad f_2(x) = x + x^3 + (x + 2x^3)^3, \]

in general

\[ f_n(x) = x + x^3 + [f_{n-1}(x)]^3. \tag{6.2.7} \]

This is a sequence of polynomials with positive integral coefficients involving only
odd powers of $x$ and $f_n(x)$ is of degree $3^n$. Further,

\[ f_{n+1}(x) - f_n(x) = [f_n(x) - f_{n-1}(x)]\,\{[f_n(x)]^2 + f_{n-1}(x) f_n(x) + [f_{n-1}(x)]^2\}. \tag{6.2.8} \]

Since $f_1(x) - f_0(x) = 2x^3 > 0$ for $x > 0$, complete induction shows that the
sequence $\{f_n(x)\}$ is strictly increasing for $x > 0$. It will then converge to a finite
limit iff it is bounded. In (6.2.8) the quantity between the braces exceeds

\[ 3[f_{n-1}(x)]^2, \]

and this quantity must be $< 1$ if the sequence $\{f_n(x)\}$ is to be bounded. Thus a
necessary condition for convergence is that there is an interval $[0, r)$ where the
condition

\[ f_n(x) < \tfrac{1}{3}\sqrt{3}, \qquad \forall n, \tag{6.2.9} \]
holds. Suppose that we have found an $x$ and an $n$ for which (6.2.9) holds. Then
by (6.2.7)

\[ f_{n+1}(x) < x + x^3 + \tfrac{1}{9}\sqrt{3} < \tfrac{1}{3}\sqrt{3} \]

provided

\[ x^3 + x - \tfrac{2}{9}\sqrt{3} < 0. \]

This says that $x$ must be less than $r$, the positive real root of the equation

\[ x^3 + x - \tfrac{2}{9}\sqrt{3} = 0, \tag{6.2.10} \]

which is approximately 0.34412. For $0 < x < r$ the inequality (6.2.9) holds for all $n$
and

\[ \lim_{n \to \infty} f_n(x) = f(x) \tag{6.2.11} \]

exists. By (6.2.7) we have

\[ f(x) = x + x^3 + [f(x)]^3, \tag{6.2.12} \]

and we have, of course, also $f(0) = 0$.
Each function $f_n(x)$ is strictly increasing, and they are uniformly bounded in
$[0, r)$. Hence the same is true for $f(x)$ and

\[ \lim_{x \uparrow r} f(x) = y_0 \tag{6.2.13} \]

exists as a finite number. Further,

\[ y_0 = r + r^3 + y_0^3 = \tfrac{2}{9}\sqrt{3} + y_0^3, \]

which is satisfied by $y_0 = \tfrac{1}{3}\sqrt{3}$. This number, then, is the least upper bound of
$f(x)$ in $[0, r]$. If we set

\[ F(x, y) = x - y + x^3 + y^3, \]

then

\[ F_y(x, y) = -1 + 3y^2, \]

and this is zero for $y = \pm y_0$. Thus the cubic $F = 0$ has vertical tangents at the
points $(-r, -y_0)$ and $(r, y_0)$. It follows that we have found the largest interval
where a unique solution exists with $f(0) = 0$. The extension to the interval $(-r, 0)$
is trivial since $F(-x, -y) = -F(x, y)$.
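
The numbers in Example 1 are easy to check by machine. The following is a minimal sketch, assuming ordinary floating point arithmetic; the helper names `positive_root` and `f_n` are hypothetical and serve only this illustration. It locates $r$ by bisection, confirms that $y_0 = \frac{1}{3}\sqrt{3}$ satisfies $y_0 = r + r^3 + y_0^3$, and evaluates the approximations (6.2.7) at a point inside and a point outside $[0, r)$.

```python
import math


def positive_root(c, lo=0.0, hi=1.0, steps=200):
    """Bisection for the root of x**3 + x - c in [lo, hi]; the cubic is increasing."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if mid ** 3 + mid - c < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def f_n(x, n):
    """Value of the n-th approximation: f_0 = x, f_n = x + x**3 + f_{n-1}**3."""
    y = x
    for _ in range(n):
        y = x + x ** 3 + y ** 3
    return y


y0 = math.sqrt(3) / 3                        # bound in (6.2.9)
r = positive_root(2 * math.sqrt(3) / 9)      # positive root of (6.2.10)

inside = [f_n(0.3, n) for n in range(6)]     # increasing and bounded by y0
outside = f_n(0.36, 8)                       # already beyond y0 after 8 steps
```

For $x = 0.3$ the values increase and remain below $y_0$, while for $x = 0.36 > r$ the bound (6.2.9) is violated after a few steps, in agreement with the discussion above.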

We have found the implicit function defined by (6.2.6) and the condition
$f(0) = 0$ and we have used the method of successive approximations. The reader
may object, and with some justification, that Theorem 6.1.2 has not really been used.
There was no mention of a Lipschitz condition and no use was made of (6.1.12).
It is easy to produce a Lipschitz condition. If $G(x, y) = x^3 + y^3$ we have

\[ |G(x_2, y_2) - G(x_1, y_1)| \le k[|x_2 - x_1| + |y_2 - y_1|] \tag{6.2.14} \]

provided we take $\alpha = \tfrac{1}{3}(3k)^{1/2}$, $\beta = \tfrac{1}{3}(3k)^{1/2}$. We can now use (6.1.12), but the
results obtainable in this manner are much less satisfactory than what we obtained
using positivity and the monotonic properties of the approximations.
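
How much less satisfactory can be estimated numerically. The sketch below rests on stated assumptions: it takes $a = 1$ and $\alpha = \beta = \frac{1}{3}(3k)^{1/2}$ as in (6.2.14), evaluates the right member of (6.1.12) on a grid of values of $k$, and picks the best one. The grid search and the name `rho_bound` are illustrative choices, not part of the text.

```python
import math


def rho_bound(k):
    """Right member of (6.1.12) for G(x, y) = x**3 + y**3 with a = 1."""
    alpha = beta = math.sqrt(3.0 * k) / 3.0      # the choice made after (6.2.14)
    return min(alpha, (1.0 - k) / (1.0 + k) * beta)


# Scan k over (0, 1) and keep the value giving the largest admissible radius.
ks = [i / 1000.0 for i in range(1, 1000)]
best_k = max(ks, key=rho_bound)
best_rho = rho_bound(best_k)
```

Under these assumptions the best admissible radius is roughly 0.17, attained near $k = \sqrt{5} - 2$, which is about half of the value of $r$ found in Example 1.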


Example 2. In Problem 2, Exercise 6.1, the question was to solve equation (6.2.6)
in the matrix case where $x, y \in \mathfrak{M}_n$. It is natural to ask to what extent the discussion
of Example 1 carries over to the matrix case. We replace the italic letters $x$ and $y$
by the corresponding script letters $\mathcal{X}$ and $\mathcal{Y}$. We have now a sequence of
matrix polynomials $\{\mathcal{F}_m(\mathcal{X})\}$ with

\[ \mathcal{F}_0(\mathcal{X}) = \mathcal{X}, \qquad \mathcal{F}_{m+1}(\mathcal{X}) = \mathcal{X} + \mathcal{X}^3 + [\mathcal{F}_m(\mathcal{X})]^3. \tag{6.2.15} \]

The numerical coefficients are positive integers and the degree of $\mathcal{F}_m(\mathcal{X})$ is still $3^m$.
Again, we determine $r$ as the positive root of the equation (6.2.10) and find that
$\|\mathcal{X}\| < r$ implies

\[ \|\mathcal{F}_m(\mathcal{X})\| < \tfrac{1}{3}\sqrt{3}, \qquad \forall m. \tag{6.2.16} \]

There is one case where the previously used argument carries over without
further ado. We can introduce partial ordering in $\mathfrak{M}_n$, defining a matrix $\mathcal{P} = (p_{jk})$
as positive if each $p_{jk} \ge 0$. If we now take $\mathcal{X} = \mathcal{P}$ with $\|\mathcal{P}\| < r$, then every $\mathcal{F}_m(\mathcal{P})$
is a positive matrix and

\[ \mathcal{F}_m(\mathcal{P}) \le \mathcal{F}_{m+1}(\mathcal{P}), \qquad \|\mathcal{F}_m(\mathcal{P})\| < \tfrac{1}{3}\sqrt{3}. \]

If $f_{m,jk}$ denotes the entry at the place $(j, k)$ in the matrix $\mathcal{F}_m(\mathcal{P})$, then for fixed $j$, $k$ the
numbers $\{f_{m,jk}\}$ form an increasing bounded sequence which thus tends to a limit
$f_{jk}$ and $(f_{jk}) = \mathcal{F}(\mathcal{P})$ is the desired limit of $\mathcal{F}_m(\mathcal{P})$.

Suppose now that $\mathcal{X} = (x_{jk})$ is an arbitrary matrix with $\|\mathcal{X}\| < r$. We set
$\mathcal{P} = (|x_{jk}|)$ and take, as usual,

\[ \|\mathcal{A}\| = \sup_j \sum_{k=1}^{n} |a_{jk}|. \]

Then

\[ \|\mathcal{X}\| = \|\mathcal{P}\|, \qquad \|\mathcal{F}_m(\mathcal{X})\| \le \|\mathcal{F}_m(\mathcal{P})\| \tag{6.2.17} \]

and also for $k < l$

\[ \|\mathcal{F}_l(\mathcal{X}) - \mathcal{F}_k(\mathcal{X})\| \le \|\mathcal{F}_l(\mathcal{P}) - \mathcal{F}_k(\mathcal{P})\|. \tag{6.2.18} \]

As we have just seen, the right member of this inequality goes to zero as $k \to \infty$.
Hence so does the left member, so that $\{\mathcal{F}_m(\mathcal{X})\}$ is a Cauchy sequence in the complete
metric space $\mathfrak{M}_n$. Then its limit is the desired solution.

This is in perfect analogy with the result for the scalar case. There is one basic
difference, however, since $\lim \mathcal{F}_m(\mathcal{X})$ may exist for some matrix $\mathcal{X}$ with $\|\mathcal{X}\| > r$,
due to the fact that $\|\mathcal{X}^n\|$ may go to zero much faster than $\|\mathcal{X}\|^n$. An extreme case is
that in which $\mathcal{X}$ is a nilpotent matrix where now no limitation on $\|\mathcal{X}\|$ is required.
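
The nilpotent case can be illustrated with exact integer arithmetic. The following is a minimal sketch, assuming a $4 \times 4$ matrix with the entry 10 along the superdiagonal, so that its fourth power vanishes while its row-sum norm is 10, far larger than $r$. The matrix and the helper names are hypothetical choices made for this illustration only.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


def matadd(*Ms):
    """Entrywise sum of several square matrices of the same size."""
    n = len(Ms[0])
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]


def cube(A):
    return matmul(A, matmul(A, A))


def row_sum_norm(A):
    return max(sum(abs(v) for v in row) for row in A)


def iterate(X, max_steps=20):
    """Apply (6.2.15) until two consecutive iterates coincide."""
    F, X3 = X, cube(X)
    for step in range(max_steps):
        F_next = matadd(X, X3, cube(F))
        if F_next == F:
            return F, step
        F = F_next
    return F, max_steps


# Nilpotent test matrix: 10 on the superdiagonal, zero elsewhere.
X = [[10 if j == i + 1 else 0 for j in range(4)] for i in range(4)]
F, steps = iterate(X)
```

For this matrix the iteration becomes stationary after finitely many steps and the limit is $\mathcal{X} + 2\mathcal{X}^3$, although the norm of $\mathcal{X}$ exceeds $r$ by a wide margin.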


EXERCISE 6.2 


1. Give a proof of Theorem 6.1.3 using the method of successive approximations.

2. Do the same for Theorem 6.1.4.

3. Verify (6.2.14).

4. Show that $3^n$ is the degree of $f_n(x)$ in (6.2.7).

5. If $f_m(x)$ is rearranged after increasing powers of $x$, it is observed that for a given $k$ a
certain number of terms at the beginning of the expansion of $f_m(x)$ will be the same
for all $m > k$. Thus for $k > 0$ every $f_k(x)$ starts out with $x + 2x^3$. How far is it
necessary to go with $k$ for all the terms of degree $< 10$ to remain unchanged and what
are these terms?

6. Suppose that $\mathcal{X}$ is a nilpotent matrix in $\mathfrak{M}_n$ with $n > 8$, $\mathcal{X}^9 = \mathcal{O}$, $\mathcal{X}^8 \ne \mathcal{O}$. Find
$\mathcal{F}(\mathcal{X})$.

7. Let $\mathcal{J}$ be an idempotent matrix. For what values of the complex number $\alpha$ would
$\mathcal{F}(\mathcal{X})$ exist if $\mathcal{X} = \alpha\mathcal{J}$?

8. Verify (6.2.17) and (6.2.18).


6.3 MAJORANTS 


There are several valuable lessons to be drawn from the discussion of Examples 1 
and 2 in the preceding section, all concerned with the importance of positivity. 
The operation 


\[ T[g](x) = x + x^3 + [g(x)]^3 \tag{6.3.1} \]

applied to $C[0, 1]$ is positive, i.e. it maps the positive cone into itself, and $T$ is
order preserving. These properties enabled us to reduce a discussion of convergence
to a discussion of boundedness. In the matrix case the existence of $\lim \mathcal{F}_m(\mathcal{P})$ for
positive matrices $\mathcal{P}$ with $\|\mathcal{P}\| < r$ enabled us to prove convergence for general
matrices of the same norm.

In this section we shall give further applications of positivity, now for the case 
where F(x, y) is a power series in the two variables x and y. One more look at 
Example 1 will be instructive. It was observed in Problem 5, Exercise 6.2, that 
there exists a formal power series 


\[ P(x) = \sum_{k=0}^{\infty} c_{2k+1} x^{2k+1} \tag{6.3.2} \]

with the following properties. Let $P_n(x)$ be the $n$th partial sum of the series. Then
there exists an integer $m > n$ such that for $p > m$ the terms of $f_p(x)$ of order
$\le 2n + 1$ coincide with $P_n(x)$. Since all coefficients are positive, we have for
$0 < x < r$

\[ P_n(x) < f_m(x) < \tfrac{1}{3}\sqrt{3}. \tag{6.3.3} \]

This implies the convergence of the formal power series and the representation

\[ f(x) = \sum_{k=0}^{\infty} c_{2k+1} x^{2k+1}. \tag{6.3.4} \]

The coefficients of this power series may be obtained from the identity

\[ \sum_{k=0}^{\infty} c_{2k+1} x^{2k+1} = x + x^3 + \left(\sum_{k=0}^{\infty} c_{2k+1} x^{2k+1}\right)^3 \tag{6.3.5} \]

by expanding the third power, collecting terms and equating coefficients of like
powers of $x$. We are now ready to generalize.
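
The first few coefficients can be generated mechanically from the identity (6.3.5). The sketch below is one way of doing so, assuming exact integer arithmetic on truncated coefficient lists: it repeats the substitution $f \to x + x^3 + f^3$, discarding powers above $x^N$, until the list no longer changes. The function names are hypothetical.

```python
def mul_trunc(p, q, N):
    """Product of two polynomials (coefficient lists), discarding powers above x**N."""
    out = [0] * (N + 1)
    for i, a in enumerate(p):
        if a == 0:
            continue
        for j, b in enumerate(q):
            if i + j > N:
                break
            out[i + j] += a * b
    return out


def series_coefficients(N):
    """Coefficients c_0, ..., c_N of the solution of f = x + x**3 + f**3 with f(0) = 0."""
    f = [0] * (N + 1)
    f[1] = 1                                  # start from f_0(x) = x
    for _ in range(N + 2):                    # each pass fixes at least one more coefficient
        g = mul_trunc(mul_trunc(f, f, N), f, N)
        g[1] += 1
        if N >= 3:
            g[3] += 1
        if g == f:
            break
        f = g
    return f


coefficients = series_coefficients(9)
```

With $N = 9$ this produces $x + 2x^3 + 6x^5 + 30x^7 + 170x^9$, and every coefficient is a positive integer, as positivity of the recurrence requires.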

In the following the variables as well as the coefficients are complex-valued at 
the outset. We very soon reduce to positive variables and coefficients, however. 
We are given a function F(z, w) of two complex variables in a neighborhood of 
(0,0) and 

\[ F(0, 0) = 0, \qquad F_w(0, 0) \ne 0. \tag{6.3.6} \]

We shall assume the relation $F(z, w) = 0$ written in the form

\[ w = a_{10} z + \sum_j \sum_k a_{jk} z^j w^k, \tag{6.3.7} \]

where in the double series $j \ge 0$, $k \ge 0$, $j + k > 1$. Assuming that the double series
converges for some non-trivial values of z and w, we are led to the following result. 


Theorem 6.3.1. Suppose that there exist three positive numbers $M$, $s$, $t$ such that

\[ |a_{jk}|\, s^j t^k \le M \tag{6.3.8} \]

for all $j$ and $k$. Then there exists a unique power series

\[ w = \sum_{n=1}^{\infty} c_n z^n, \tag{6.3.9} \]

convergent for

\[ |z| < r = \left(\frac{t}{t + 2M}\right)^2 s, \tag{6.3.10} \]

such that

\[ \sum_{n=1}^{\infty} c_n z^n = a_{10} z + \sum_j \sum_k a_{jk} z^j \left(\sum_{n=1}^{\infty} c_n z^n\right)^k \tag{6.3.11} \]

is an identity between absolutely convergent power series for $|z| < r$.


Proof. Assuming convergence of the series involved, we pose the identity (6.3.11)
with a view to determining how the coefficients $c_n$ have to be chosen for the identity
to hold. If the series (6.3.9) has a positive radius of convergence, $R$ say, then we can
form the $k$th power of the series using Cauchy's product theorem according to
which the $k$th power can be written as a power series in $z$ starting with a term in $z^k$
and absolutely convergent for $|z| < R$. If now $|z| < s$ and $|z|$ is sufficiently small,
then the sum of the series (6.3.9) is less than $t$ in absolute value. For such values of $z$
the triple series is absolutely convergent and may be rearranged in any way we
please, for instance as a power series in $z$. We have now an identity between two
power series which can hold iff corresponding coefficients are equal. On the left
the coefficient of $z^n$ is $c_n$. On the right we get a number of terms involving $z^n$ and
the sum of their coefficients must then also equal $c_n$.

In the right member a term in $z^n$ can arise only in a finite number of ways.
For $n = 1$ there is only one term, namely $a_{10} z$, so $c_1 = a_{10}$. For $n > 1$ we must have

\[ 0 \le j \le n, \qquad 0 \le k \le n, \qquad 1 < j + k \le n. \tag{6.3.12} \]

There is no term in the right member with $j = 0$, $k = 1$. Hence the coefficient of $z^n$
in the right member involves no $c_p$ with $p \ge n$. This means that

\[ c_n = M_n(a_{jk}; c_1, c_2, \ldots, c_{n-1}). \tag{6.3.13} \]

Here $M_n$ is a multinomial in $c_1, c_2, \ldots, c_{n-1}$ and is linear in the $a_{jk}$ where $j$ and $k$ are
subject to the condition (6.3.12). Any numerical coefficient which occurs in $M_n$ is a
positive integer. It follows that the coefficients $c_n$ may be computed successively in
terms of the preceding coefficients. The first three relations under (6.3.13) read

\[
\begin{aligned}
c_1 &= a_{10}, \\
c_2 &= a_{20} + a_{11} c_1 + a_{02} c_1^2, \\
c_3 &= a_{30} + a_{21} c_1 + a_{12} c_1^2 + a_{03} c_1^3 + a_{11} c_2 + 2 a_{02} c_1 c_2.
\end{aligned}
\tag{6.3.14}
\]


The next problem is that of proving convergence of the formal series (6.3.9)
for sufficiently small values of $|z|$. This will be achieved if we can prove the existence
of a power series

\[ \sum_{n=1}^{\infty} C_n z^n \tag{6.3.15} \]

with

\[ |c_n| \le C_n, \qquad \forall n, \tag{6.3.16} \]

having a positive radius of convergence. Such a series is known as a majorant of the
given series and the relation between them is expressed by the symbol $\ll$, thus

\[ \sum_{n=1}^{\infty} c_n z^n \ll \sum_{n=1}^{\infty} C_n z^n. \tag{6.3.17} \]

This simple observation goes back to Cauchy, who developed what he called a
Calcul de limites to prove convergence theorems for implicit functions and for
solutions of differential equations. In this connection "limites" means "bounds"
rather than "limits."


For the problem of finding a suitable majorant for the series (6.3.9) we take
another look at the recurrence relations (6.3.13) for the determination of the $c_n$.
Suppose that in these relations we replace the $a_{jk}$ by positive numbers $A_{jk}$ such that

\[ |a_{jk}| \le A_{jk}, \qquad \forall j, k. \tag{6.3.18} \]

Let us then determine $C_n$ from the relations

\[ C_n = M_n(A_{jk}; C_1, C_2, \ldots, C_{n-1}) \tag{6.3.19} \]

for $n = 1, 2, 3, \ldots$. Then

\[ |c_1| = |a_{10}| \le A_{10} = C_1 \]

and the formulas (6.3.14) show that

\[ |c_2| \le C_2, \qquad |c_3| \le C_3. \]

The general inequality

\[ |c_n| \le C_n \]

now follows by complete induction. Here we have used the structure of the forms
(6.3.13). All numerical coefficients are positive. This implies that the forms are
positive for positive entries and are increasing functions of each of the arguments
when the latter are positive. This type of positivity gives the inequality

\[ |M_n(a_{jk}; c_1, c_2, \ldots, c_{n-1})| \le M_n(|a_{jk}|; |c_1|, |c_2|, \ldots, |c_{n-1}|) \le M_n(A_{jk}; C_1, C_2, \ldots, C_{n-1}) \tag{6.3.20} \]

if (6.3.18) holds.
What we have done is to introduce an auxiliary equation

\[ w = A_{10} z + \sum_j \sum_k A_{jk} z^j w^k, \tag{6.3.21} \]


where the right member is a majorant of the right member of (6.3.7). Equation 
(6.3.21) has the formal solution (6.3.15), where the coefficients are determined by 
(6.3.19). If it can be shown that the series (6.3.15) has a positive radius of con- 
vergence, R say, then it becomes an actual solution of (6.3.21). Moreover, since 
(6.3.15) is a majorant of (6.3.9), the latter series will also be convergent at least for 
|z| < R, and this implies that the formal series (6.3.9) is an actual solution of (6.3.7) 
in its domain of convergence. This observation may be stated roughly as: 
A majorant of the equation gives a majorant of the solution. 

We still have the problem of finding a suitable set of coefficients $A_{jk}$. Such a set
is furnished by condition (6.3.8). We can take

\[ A_{jk} = M s^{-j} t^{-k}, \qquad \forall j, k. \tag{6.3.22} \]

Now the corresponding series

\[ G(z, w) = A_{10} z + \sum_j \sum_k A_{jk} z^j w^k \tag{6.3.23} \]

is seen to be absolutely convergent for $|z| < s$, $|w| < t$. It is essentially the product
of two geometric series, one in $z/s$, the other in $w/t$. More precisely we have

\[ G(z, w) = M\left\{\left(1 - \frac{z}{s}\right)^{-1}\left(1 - \frac{w}{t}\right)^{-1} - 1 - \frac{w}{t}\right\}. \tag{6.3.24} \]

Now the majorant equation

\[ w = G(z, w) \tag{6.3.25} \]

is a quadratic equation in $w$ with coefficients which are rational functions of $z$, $s$, $t$.
We want the root of this equation which is zero for $z = 0$. It is given by

\[ w = \frac{t^2}{2(t + M)} - \left\{\left(\frac{t^2}{2(t + M)}\right)^2 - \frac{M t^2}{t + M}\,\frac{z}{s - z}\right\}^{1/2}, \tag{6.3.26} \]

where the positive determination of the root is taken for $z = 0$. This expression is
of the form

\[ w = A - B (r - z)^{1/2} (s - z)^{-1/2}, \tag{6.3.27} \]

where $r$ is defined by (6.3.10) and $A$ and $B$ are positive rational functions of $M$ and $t$
of no importance for the following. Now $w$ can be expanded in powers of $z$ by
expanding $(r - z)^{1/2}$ and $(s - z)^{-1/2}$ in binomial series in $z/r$ and $z/s$, respectively.
The first of these series converges absolutely for $|z| < r$, the second for $|z| < s$.
Here $0 < r < s$ and the Cauchy product series converges absolutely for $|z| < r$.
This says that the majorant series defined by (6.3.27) is absolutely convergent for
$|z| < r$. This series must coincide with the formal series

\[ \sum_{n=1}^{\infty} C_n z^n, \]


where the C,’s are determined by (6.3.19) and the A,, by (6.3.22). From the con- 
vergence of the majorant series (6.3.15) for |z| <r we conclude that the formal 
series (6.3.9) for the solution of the original problem also converges for |z| < r. 
But this means that the various operations which were performed in order to find 
this series are legitimate and the formal solution is an actual one and clearly
unique. ∎


What we have obtained is essentially an existence proof. There exists a unique 
power series (6.3.9) which satisfies (6.3.11) in its domain of convergence, a circle of 
radius at least equal to r. The method does not determine the radius of con- 
vergence: all it gives is lower bounds for the radius in terms of the quantities M, s, 
and t. Normally these are not given in advance; we are given a power series (6.3.7) 
and from our knowledge of the coefficients numbers M, s, t have to be deduced. 
This can be done in infinitely many ways in general and there is a problem of 
optimization: What admissible choice of M, s, t will give the largest possible lower 
bound for r? Normally this problem cannot be solved and we are left with what is 
obtainable by cursory inspection. 

Let us look at our standard example

\[ w = z + z^3 + w^3. \]

Here a possible choice is obviously $M = s = t = 1$ since we deal with a polynomial
and all coefficients are 0 or 1. By (6.3.10) this gives $r = \tfrac{1}{9}$, less than one-third of the actual
radius of convergence.
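
The bound just quoted can be checked with exact fractions. The sketch below assumes that the radius in (6.3.10) is $r = s\,(t/(t + 2M))^2$, which is the value of $z$ at which the expression under the square root in (6.3.26) vanishes. The function names are hypothetical, and the floating point check at $z = 0.05$ is merely one sample point.

```python
import math
from fractions import Fraction


def cauchy_radius(M, s, t):
    """Lower bound r = s * (t / (t + 2M))**2 for the radius of convergence."""
    return s * (t / (t + 2 * M)) ** 2


def discriminant(z, M, s, t):
    """Expression under the square root in (6.3.26)."""
    return (t * t / (2 * (t + M))) ** 2 - (M * t * t / (t + M)) * z / (s - z)


def majorant_root(z, M, s, t):
    """Root of w = G(z, w) that vanishes at z = 0, for real 0 <= z < r."""
    return t * t / (2 * (t + M)) - math.sqrt(discriminant(z, M, s, t))


def G(z, w, M, s, t):
    """Cauchy majorant (6.3.24)."""
    return M * (1.0 / ((1 - z / s) * (1 - w / t)) - 1 - w / t)


one = Fraction(1)
r_standard = cauchy_radius(one, one, one)      # the standard example, M = s = t = 1
w_sample = majorant_root(0.05, 1.0, 1.0, 1.0)  # a point with 0 < z < r
```

For $M = s = t = 1$ the radius comes out as exactly $\frac{1}{9}$, the discriminant vanishes there, and the sample value of $w$ satisfies the majorant equation (6.3.25) to rounding accuracy.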

The majorant $G(z, w)$ defined by (6.3.24) is that given by Cauchy. The choice

\[ A_{jk} = |a_{jk}|, \tag{6.3.28} \]

which may claim to be the best choice, was used by Ernst Lindelöf (1870-1946) to
prove existence theorems for implicit functions and for differential equations in
1896-9.


EXERCISE 6.3 


1. Verify (6.3.3). 


2. What recurrence relations are implied by (6.3.5)? Show that all coefficients are 
divisible by 6 except c, and c;3. 


3. Verify (6.3.14). 


4. The equation $w = z^3 + w^5$ has a unique power series solution which is 0 for $z = 0$.
Show that all coefficients $c_n$ are zero except those for which $n = 3 + 12k$, $k = 0,
1, 2, \ldots$. Find a lower bound for the radius of convergence.


5. Find the solution of w = sin z sin w with w = 0 for z = 0. 


6. Show that the relation $\ll$ is transitive and preserved under addition and
multiplication by a positive number. 


7. Verify (6.3.26) and (6.3.27). 
8. Supply details omitted in the discussion of the series expansion for (6.3.27). 


9. Extend Theorem 6.3.1 to the matrix case. The $a_{jk}$ are to be scalars but $z = \mathcal{Z}$ and
$w = \mathcal{W}$ are matrices in $\mathfrak{M}_n$. On the whole the proof can be imitated, but there are
some difficult points in connection with inverses and square roots. Assume 4 = J 
to be a positive matrix; carry out the argument for this case and then generalize. 


6.4 APPLICATIONS TO ORDINARY DIFFERENTIAL EQUATIONS 


In the preceding sections we have discussed several aspects of the implicit function
theorem at some length. The main object was not to arrive at results by the shortest
route but to describe alternate routes, to bring out their distinctive features, and to
elucidate advantages and limitations.

We have now various powerful techniques at our disposal and shall apply them
to a brief study of ordinary differential equations. Here again Cauchy was the
pioneer both in the real and in the complex domain. Let us first describe the
problems in general terms.

A first order differential equation is a relation of the form

\[ y'(x) = F[x, y(x)] \tag{6.4.1} \]

with an initial condition

\[ y(x_0) = y_0. \tag{6.4.2} \]

As to the meaning to be attached to the symbols, we have a wide choice. In the
simplest case $x$ and $y$ are real variables: $F(x, y)$ is a function from $R^1 \times R^1$ to $R^1$,
defined in some domain $D$ containing the point $(x_0, y_0)$. It is customary to require
that $F(x, y)$ be a continuous function of $(x, y)$ in $D$. With such assumptions there
is at least one solution $y(x)$ defined in some interval $(x_0 - r, x_0 + r)$ such that
(6.4.1) and (6.4.2) are satisfied. To ensure uniqueness of the solution, a Lipschitz
condition

\[ |F(x, y_1) - F(x, y_2)| \le K|y_1 - y_2| \tag{6.4.3} \]

is usually added. Here $K$ is any positive number, not necessarily between 0 and 1.

A more general situation is that where $F(x, y)$ is a function from $R^1 \times R^m$ to
$R^m$ or from $C^1 \times C^m$ to $C^m$ defined and continuous in some domain $D$, say
$|x - x_0| < a$, $\|y - y_0\| < b$. Again a Lipschitz condition

\[ \|F(x, y_1) - F(x, y_2)\| \le K\|y_1 - y_2\| \tag{6.4.4} \]

will ensure uniqueness of the solution of

\[ y'(x) = F[x, y(x)], \qquad y(x_0) = y_0. \tag{6.4.5} \]

There are other possibilities. We may take $\mathcal{F}(x, \mathcal{Y})$ as a function from $R^1 \times \mathfrak{M}_n$
to $\mathfrak{M}_n$ defined and continuous for $|x - x_0| < a$, $\|\mathcal{Y} - \mathcal{Y}_0\| < b$ with a Lipschitz
condition

\[ \|\mathcal{F}(x, \mathcal{Y}_1) - \mathcal{F}(x, \mathcal{Y}_2)\| \le K\|\mathcal{Y}_1 - \mathcal{Y}_2\| \tag{6.4.6} \]

to ensure uniqueness of the solution of

\[ \mathcal{Y}'(x) = \mathcal{F}[x, \mathcal{Y}(x)], \qquad \mathcal{Y}(x_0) = \mathcal{Y}_0. \tag{6.4.7} \]


The case where the matrix algebra is replaced by a general B-algebra is not 
essentially different. 

We could also assume $F(x, y)$ to be a function from $R^1 \times X$ to $X$ where $X$ is a
B-space. Here we run into difficulties with the meaning to be assigned to $y'(x)$, the
derivative of $y(x)$ with respect to $x$. As we shall see in the next chapter, in a B-space
there are weak derivatives as well as strong derivatives and other interpretations
are also occasionally available. We shall not elaborate further in this direction.

On the other hand, attention should be paid to the case 


\[ w'(z) = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} a_{jk} z^j w^k, \tag{6.4.8} \]

where, in analogy with the implicit function case, we may expect the solution which
is 0 for $z = 0$ to be representable by a power series in $z$ convergent for small values
of $|z|$.

This ends the survey. We shall complete it by proving some of the results 
indicated, taking care to exhibit the various methods at our disposal. We start with 
problem (6.4.5), where to simplify we take (xo, yo) = (0,0). We shall use the 
contraction fixed point theorem. 

Theorem 6.4.1. Let $F(x, y)$ be a function from $C^1 \times C^m$ to $C^m$ defined and
continuous in the domain $D$: $|x| < a$, $\|y\| < b$. Further, the Lipschitz condition
(6.4.4) shall hold and

\[ \|F(x, y)\| \le M \tag{6.4.9} \]

in $D$. Then problem (6.4.5) has a unique solution defined for

\[ |x| \le r < \min\left(a, \frac{b}{M}, \frac{1}{K}\right). \tag{6.4.10} \]


Proof. We have to construct a complete metric space X and a contraction mapping 
from X into itself such that the existing unique fixed point gives the solution of 
(6.4.5). To this end, let $X$ be the set of all functions $g(x)$ defined and continuous in
the interval $[-r, r]$, having values in $C^m$, and such that $g(0) = 0$ and $\|g(x)\| \le b$
for all $x$ in the interval. As distance in $X$ we take

\[ d[g, h] = \sup \|g(x) - h(x)\|. \tag{6.4.11} \]

It may be shown that $X$ is complete in this metric. Define

\[ T[g](x) = \int_0^x F[s, g(s)]\,ds, \tag{6.4.12} \]

where, as usual, the integral of a vector is taken as the vector whose components
are the integrals of the components of the integrand. Since the range of $g(s)$ belongs
to the domain of definition of the integrand, the latter exists and is a continuous
function of $s$. Hence the integral exists and is a continuous function of $x$ in $[-r, r]$.
Further,

\[ \|T[g](x)\| \le M|x| \le Mr < b \]

by the choice of $r$. We have also $T[g](0) = 0$ so that $T$ maps $X$ into itself. It is to
be shown that $T$ is a contraction. Now

\[
\begin{aligned}
\|T[g_1](x) - T[g_2](x)\| &= \left\|\int_0^x \{F[s, g_1(s)] - F[s, g_2(s)]\}\,ds\right\| \\
&\le \left|\int_0^x \|F[s, g_1(s)] - F[s, g_2(s)]\|\,ds\right| \\
&\le K\left|\int_0^x \|g_1(s) - g_2(s)\|\,ds\right| \le Kr\,d[g_1, g_2]
\end{aligned}
\]

so that

\[ d\{T[g_1], T[g_2]\} \le Kr\,d[g_1, g_2]. \tag{6.4.13} \]

Since $Kr$ is a fixed positive number $< 1$, the mapping is a contraction. Hence there
is a unique fixed point of $T$, i.e. there exists an $f(x) \in X$ such that

\[ f(x) = \int_0^x F[s, f(s)]\,ds. \tag{6.4.14} \]
We have clearly $f(0) = 0$. The right member is differentiable with respect to the
upper limit $x$ of the integral since the integrand is continuous. Hence $f'(x)$
exists and

\[ f'(x) = F[x, f(x)]. \]

This is the desired solution and it is unique in $X$. It should be observed that a
solution of problem (6.4.5) must satisfy the integral equation (6.4.14) for small
values of $|x|$. Under the stated assumptions on $F(x, y)$, if a solution exists, it must
belong to the space $X$. The proof shows that $X$ contains a solution, one and only
one solution. ∎
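
The contraction can be watched at work on a grid. The following is a minimal sketch under stated assumptions: the scalar equation $y' = x + y$, $y(0) = 0$, on $0 \le x \le r$ with $a = b = 1$, $M = 2$, $K = 1$ and $r = 0.45$, and the integral in (6.4.12) replaced by the trapezoid rule. The equation, the grid size and the helper names are illustrative choices, not taken from the text; the exact solution $e^x - 1 - x$ is used only for comparison.

```python
import math


def picard_step(F, xs, g):
    """Apply T[g](x) = integral_0^x F(s, g(s)) ds on a grid, using the trapezoid rule."""
    out = [0.0]
    for i in range(1, len(xs)):
        h = xs[i] - xs[i - 1]
        out.append(out[-1] + 0.5 * h * (F(xs[i - 1], g[i - 1]) + F(xs[i], g[i])))
    return out


def sup_distance(g, h):
    """Discrete analogue of the metric (6.4.11)."""
    return max(abs(u - v) for u, v in zip(g, h))


def F(x, y):
    # Illustrative right member: |F| < 2 and Lipschitz constant 1 on |x| < 1, |y| < 1.
    return x + y


a, b, M, K = 1.0, 1.0, 2.0, 1.0
r = 0.45                                  # satisfies (6.4.10)
n = 200
xs = [r * i / n for i in range(n + 1)]

g = [0.0] * (n + 1)                       # start from g = 0
distances = []                            # d[T^(m+1) g, T^m g] for successive m
for _ in range(40):
    g_next = picard_step(F, xs, g)
    distances.append(sup_distance(g_next, g))
    g = g_next

exact = [math.exp(x) - 1.0 - x for x in xs]
```

In this sketch successive distances shrink by at least the factor $Kr$, as (6.4.13) predicts, and the limit agrees with the exact solution up to the discretization error of the quadrature.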


The value of r is subjected to the condition Kr < 1. This undesirable restriction 
could be eliminated by showing that some power of T is a contraction regardless of 
the value of K. This condition does not appear when the method of successive 
approximations is used. 

We shall now apply the latter to the matrix case in order to vary the object of 
investigation and the mode of approach. 


Theorem 6.4.2. Let $\mathcal{F}(x, \mathcal{Y})$ be a function from $C^1 \times \mathfrak{M}_n$ to $\mathfrak{M}_n$, defined and
continuous in $D$: $|x| < a$, $\|\mathcal{Y}\| < b$. Further, $\mathcal{F}$ satisfies the Lipschitz condition
(6.4.6) and is bounded in $D$,

\[ \|\mathcal{F}(x, \mathcal{Y})\| \le M. \tag{6.4.15} \]

Then the equation (6.4.7) with $x_0 = 0$, $\mathcal{Y}_0 = \mathcal{O}$ has a unique solution in $\mathfrak{M}_n$ for

\[ |x| \le r < \min\left(a, \frac{b}{M}\right). \tag{6.4.16} \]

Proof. We define a sequence of approximating matrices $\{\mathcal{Y}_m(x)\}$:

\[ \mathcal{Y}_0(x) = \mathcal{O}, \qquad \mathcal{Y}_m(x) = \int_0^x \mathcal{F}[s, \mathcal{Y}_{m-1}(s)]\,ds. \tag{6.4.17} \]

By definition, the derivative of a matrix is the matrix of the derivatives of the
entries and the integral of a matrix is the matrix of the integrals of the entries. It
should be shown that the approximations exist, are continuous, and satisfy
$\|\mathcal{Y}_m(x)\| < b$ for $|x| \le r$. Suppose that the first $k$ approximations have been found
to be in order. Then $\mathcal{F}[s, \mathcal{Y}_k(s)]$ is defined as a continuous matrix function for
$|s| \le r$; $\mathcal{Y}_{k+1}(x)$ exists and is continuous and for $|x| \le r$

\[ \|\mathcal{Y}_{k+1}(x)\| \le \left|\int_0^x \|\mathcal{F}[s, \mathcal{Y}_k(s)]\|\,ds\right| \le Mr < b \]

by the choice of $r$. Hence all approximations are in order.
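To prove convergence of the sequence $\{\mathcal{Y}_m(x)\}$ we resort to the Lipschitz
condition. We have, for $0 < |x| \le r$,

\[
\|\mathcal{Y}_m(x) - \mathcal{Y}_{m-1}(x)\|
\le \left| \int_0^x \|\mathcal{F}[s, \mathcal{Y}_{m-1}(s)] - \mathcal{F}[s, \mathcal{Y}_{m-2}(s)]\|\,ds \right|
\le K \left| \int_0^x \|\mathcal{Y}_{m-1}(s) - \mathcal{Y}_{m-2}(s)\|\,ds \right|.
\]

Suppose that it be known that for some integer $k$

\[
\|\mathcal{Y}_k(x) - \mathcal{Y}_{k-1}(x)\| \le M K^{k-1} \frac{|x|^k}{k!}. \tag{6.4.18}
\]

This is certainly true for $k = 1$. Then

\[
\|\mathcal{Y}_{k+1}(x) - \mathcal{Y}_k(x)\| \le M \frac{K^k}{k!} \int_0^{|x|} s^k\,ds = M K^k \frac{|x|^{k+1}}{(k+1)!}.
\]

This shows that (6.4.18) is true for all values of $k$. Hence

\[
\sum_{m=1}^{\infty} [\mathcal{Y}_m(x) - \mathcal{Y}_{m-1}(x)]
\]

converges uniformly in the interval $[-r, r]$ and

\[
\lim_{m \to \infty} \mathcal{Y}_m(x) = \mathcal{Y}(x)
\]

exists. Since

\[
\lim_{m \to \infty} \mathcal{Y}_m(x) = \lim_{m \to \infty} \int_0^x \mathcal{F}[s, \mathcal{Y}_{m-1}(s)]\,ds = \int_0^x \mathcal{F}[s, \lim_{m \to \infty} \mathcal{Y}_{m-1}(s)]\,ds,
\]

it follows that

\[
\mathcal{Y}(x) = \int_0^x \mathcal{F}[s, \mathcal{Y}(s)]\,ds, \qquad \mathcal{Y}(0) = \mathcal{O}.
\]

Again we can differentiate with respect to the upper limit and obtain

\[
\mathcal{Y}'(x) = \mathcal{F}[x, \mathcal{Y}(x)]
\]

as desired. A uniqueness proof may be based on the Lipschitz condition following
the lines of the uniqueness proof in Section 6.2. The details are left to the reader.
Actually a uniqueness proof based on a Lipschitz condition goes back to the fact
that the functional inequality

\[
g(x) \le K \int_0^x g(s)\,ds \tag{6.4.19}
\]

has in $C^+[0, r]$ the unique solution $g(x) \equiv 0$. Such questions will be explored in
Chapter 12. ∎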
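The successive approximations (6.4.17) can be carried out exactly when the right member is a polynomial. As a minimal sketch, assume the $1 \times 1$ case with the hypothetical right member $1 + y^2$ (the scalar analogue of the equation in Problem 4 below), whose solution with $y(0) = 0$ is $\tan x$. Polynomials are stored as coefficient lists with exact rational entries; all function names are illustrative.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (index = degree)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_integrate(p):
    """Integral from 0 to x of the polynomial p."""
    return [Fraction(0)] + [a / (k + 1) for k, a in enumerate(p)]

def picard_step(y):
    """One approximation step: y -> integral_0^x (1 + y(s)^2) ds."""
    integrand = poly_mul(y, y)
    integrand[0] += 1
    return poly_integrate(integrand)

def picard(m):
    """m-th successive approximation starting from y_0 = 0."""
    y = [Fraction(0)]
    for _ in range(m):
        y = picard_step(y)
    return y

y4 = picard(4)
```

The leading coefficients of `y4` are $x + x^3/3 + 2x^5/15 + 17x^7/315$, which agree with the Maclaurin series of $\tan x$ through degree 7; higher coefficients of `y4` are not yet final and change in later steps.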
Our last item in this section is the discussion of equation (6.4.8), which calls for
the use of majorants.


Theorem 6.4.3. Let the coefficients in the right member of (6.4.8) satisfy the
condition

\[
|a_{jk}| \le M s^{-j} t^{-k} \tag{6.4.20}
\]

for some choice of positive numbers $M$, $s$, $t$. Then the differential equation (6.4.8)
has a unique solution

\[
w(z) = \sum_{n=1}^{\infty} c_n z^n, \tag{6.4.21}
\]

where the series is absolutely convergent for

\[
|z| < r \equiv s\left[1 - \exp\left(-\frac{t}{2Ms}\right)\right]. \tag{6.4.22}
\]


Proof. We know the general pattern to follow from the discussion in Section 6.3.
The power series (6.4.21) is substituted in (6.4.8) to obtain

\[
\sum_{n=1}^{\infty} n c_n z^{n-1} = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} a_{jk} z^j \left( \sum_{n=1}^{\infty} c_n z^n \right)^k. \tag{6.4.23}
\]

Here the $k$th power is expanded in a power series starting with a term in $z^k$. If
(6.4.21) has a positive radius of convergence $R$, then the $k$th powers give series with
the same radius of convergence. For $|z| < s$ and sufficiently small, the sum of the
solution series is less than $t$ in absolute value; the triple series is absolutely
convergent and may be rearranged ad lib. If it be rewritten as a power series in $z$, the
two power series on the left and on the right must be identical. This implies that the
coefficients of $z^n$ on the two sides must be equal.

Now on the left the constant term is $c_1$, on the right $a_{00}$, so $c_1 = a_{00}$. For
$n > 0$ the admissible values of $j$ and $k$ satisfy

\[
0 \le j \le n, \quad 0 \le k \le n, \quad 1 \le j + k \le n. \tag{6.4.24}
\]

Since there may be a term $a_{01} w$ in the right member of (6.4.8), the coefficient of $z^n$
may involve $c_n$ but no $c_p$ with $p > n$ can occur. Hence

\[
(n + 1) c_{n+1} = P_n(a_{jk}; c_1, c_2, \ldots, c_n), \tag{6.4.25}
\]

where $P_n$ is a multinomial in $c_1$ to $c_n$, linear in the $a_{jk}$, where $j$ and $k$ are subject to
(6.4.24). All numerical coefficients are positive integers. This means that the $c_n$'s
may be computed consecutively from the preceding coefficients, i.e. ultimately in
terms of $c_1$. The first four relations are

\[
\begin{aligned}
1 c_1 &= a_{00}, \\
2 c_2 &= a_{10} + a_{01} c_1, \\
3 c_3 &= a_{20} + a_{11} c_1 + a_{02} c_1^2 + a_{01} c_2, \\
4 c_4 &= a_{30} + a_{21} c_1 + a_{12} c_1^2 + a_{03} c_1^3 + 2 a_{02} c_1 c_2 + a_{11} c_2 + a_{01} c_3.
\end{aligned} \tag{6.4.26}
\]
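The consecutive computation of the coefficients can be sketched in a few lines. This is an illustrative implementation under stated assumptions, not the author's procedure: the coefficients $a_{jk}$ are passed as a finite dictionary `a[(j, k)]`, arithmetic is exact, and the names `power_coeff` and `series_coefficients` are hypothetical. The coefficient of $z^n$ on the right of (6.4.23) is collected by truncated series multiplication and divided by $n + 1$, as in (6.4.25).

```python
from fractions import Fraction

def power_coeff(c, k, d):
    """Coefficient of z^d in (sum_i c[i] z^i)^k, using only c[0..d]."""
    p = [Fraction(1)] + [Fraction(0)] * d          # the constant series 1
    for _ in range(k):
        p = [sum(p[i] * c[m - i] for i in range(m + 1)) for m in range(d + 1)]
    return p[d]

def series_coefficients(a, N):
    """Return [c_0, c_1, ..., c_N] (with c_0 = 0) for w' = sum a[(j,k)] z^j w^k, w(0) = 0."""
    c = [Fraction(0)] * (N + 1)
    for n in range(N):                             # coefficient of z^n fixes c_{n+1}
        total = Fraction(0)
        for (j, k), ajk in a.items():
            if j <= n:
                total += ajk * power_coeff(c, k, n - j)
        c[n + 1] = total / (n + 1)
    return c

# w' = 1 + w has the solution e^z - 1, so c_n = 1/n!.
c_exp = series_coefficients({(0, 0): Fraction(1), (0, 1): Fraction(1)}, 4)

# w' = z^2 + w^2 (the equation of Problem 7 below).
c_riccati = series_coefficients({(2, 0): Fraction(1), (0, 2): Fraction(1)}, 11)
```

For the second equation only $c_3$, $c_7$, $c_{11}$ are non-zero among the first eleven coefficients, in line with the pattern $n = 3 + 4k$.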
The series (6.4.21) will converge if a convergent majorant series

\[
\sum_{n=1}^{\infty} C_n z^n, \qquad |c_n| \le C_n, \ \forall n, \tag{6.4.27}
\]

can be found. We construct such a series by finding a formal solution of a
majorant equation

\[
w' = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} A_{jk} z^j w^k, \qquad |a_{jk}| \le A_{jk}, \ \forall j, k. \tag{6.4.28}
\]

That such a series is a majorant of the solution is shown by an analysis of the
multinomials $P_n$. Again we are concerned with the implications of positivity.
The conditions (6.4.20) suggest taking

\[
A_{jk} = M s^{-j} t^{-k}, \quad \forall j, k. \tag{6.4.29}
\]

Then the series

\[
H(z, w) = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} A_{jk} z^j w^k \tag{6.4.30}
\]

is absolutely convergent for $|z| < s$, $|w| < t$. Here $H(z, w)$ is $M$ times the product
of two geometric series or in closed form

\[
H(z, w) = M \left(1 - \frac{z}{s}\right)^{-1} \left(1 - \frac{w}{t}\right)^{-1} \tag{6.4.31}
\]

and the differential equation

\[
w' = H(z, w)
\]

is simply

\[
\left(1 - \frac{w}{t}\right) w' = M \left(1 - \frac{z}{s}\right)^{-1}, \tag{6.4.32}
\]

which is integrable by elementary methods. With $w(0) = 0$ we get

\[
w - \frac{1}{2t} w^2 = -Ms \log\left(1 - \frac{z}{s}\right). \tag{6.4.33}
\]

The logarithm is uniquely determined by its initial value 0 for $z = 0$. This is a
quadratic equation in $w$; there are two roots, only one of which is 0 for $z = 0$,
namely

\[
w = t\left\{1 - \left[1 + \frac{2Ms}{t} \log\left(1 - \frac{z}{s}\right)\right]^{1/2}\right\}. \tag{6.4.34}
\]

This solution may be expanded in powers of $z$ and the resulting series is the desired
majorant series for (6.4.21). The expansion may be made in two steps. We first
expand the square root in terms of powers of

\[
Z = \frac{2Ms}{t} \log\left(1 - \frac{z}{s}\right). \tag{6.4.35}
\]

The series converges for $|Z| < 1$. The largest value of $|Z|$ is for $z = x$ where
$0 < x < s$. A simple computation shows that the least upper bound for $x$ is $r$, the
quantity defined by (6.4.22). For $|z| < r < s$ we can now expand the logarithm in
powers of $z$ and then collect terms. The result is the majorant series which is
absolutely convergent for $|z| < r$. It follows that (6.4.21) is also absolutely
convergent for $|z| < r$ and in this circular disk it is a solution of (6.4.8), the only solution
of this equation which is 0 for $z = 0$. ∎
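The "simple computation" behind the radius (6.4.22) can be checked numerically: on the positive real axis $|Z|$ increases with $z$ and reaches 1 exactly at $z = r$. The sketch below assumes the forms of (6.4.22) and (6.4.35) used in this section; the sample values of $M$, $s$, $t$ and the function names are arbitrary illustrations.

```python
import math

def majorant_radius(M, s, t):
    """r = s * (1 - exp(-t / (2 M s))), the bound in (6.4.22)."""
    return s * (1.0 - math.exp(-t / (2.0 * M * s)))

def Z(z, M, s, t):
    """Z = (2 M s / t) * log(1 - z / s) for real 0 <= z < s, as in (6.4.35)."""
    return (2.0 * M * s / t) * math.log(1.0 - z / s)

M, s, t = 2.0, 1.5, 0.75          # hypothetical constants from (6.4.20)
r = majorant_radius(M, s, t)
Z_at_r = Z(r, M, s, t)            # equals -1 up to rounding
Z_inside = Z(0.5 * r, M, s, t)    # strictly between -1 and 0
```

Note that $r < s$ always, and that $r$ shrinks as $M$ grows, which is why the choice of $M$, $s$, $t$ affects the lower bound obtained for the radius of convergence.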


The radius of convergence of the majorant series is uniquely determined by the
discussion, but that of the actual solution is not and we have the task of finding
values of $M$, $s$, $t$ which maximize the lower bound. We shall not devote any time to
this task.


EXERCISE 6.4 


1. Show that a power of the transformation T defined by (6.4.12) is a contraction. 

2. Show that Theorem 6.4.2 remains true if matrices are replaced by the elements of a 
B-algebra with unit element. 

3. Prove Theorem 6.4.1 by the method of successive approximations.

4. Solve the equation $\mathcal{Y}'(t) = \mathcal{E} + [\mathcal{Y}(t)]^2$, $\mathcal{Y}(0) = \mathcal{O}$, explicitly in $\mathfrak{M}_n$ and find the
largest interval $[-a, a]$ in the interior of which $\mathcal{Y}(t)$ is continuous.

5. Write out a detailed proof of convergence of the series expansion for (6.4.35) follow- 
ing the lines indicated in the text. 


6. If the reader has had a course in analytic function theory, he would argue instead that 
the majorant series converges in the largest disk in which the function (6.4.34) is 
holomorphic. Give details! 


7. The equation $w' = z^2 + w^2$ has a power series solution which is 0 for $z = 0$. Show
that the coefficients $c_n$ are zero unless $n = 3 + 4k$, $k = 0, 1, 2, \ldots$. Find a lower
bound for the radius of convergence.


8. Try to work out a convergence proof replacing the Cauchy majorant $H(z, w)$ by the
Lindelöf one where $A_{jk} = |a_{jk}|$, $\forall j, k$.


9. In the proof of Theorem 6.4.1 show that $X$ is complete under the metric defined by
(6.4.11).
7 REAL ANALYSIS IN LINEAR SPACES*


There are several good reasons why an analyst should take an interest in linear 
spaces. The functions that he considers often form a linear space which may facili- 
tate the discussion. 

On a different plane, it is striking to what extent the classical properties of 
functions of a real or a complex variable remain meaningful in a more general 
setting. We may mention such properties as continuity, differentiability, measur- 
ability, integrability, and analyticity. This is the type of analysis that we have in 
mind here. There are several levels of study. We have to start by modifying the 
classical definition of the basic properties to adjust to the new situation. The new 
concepts should be studied at some length. The classical theory can serve as a 
pattern, but we must be prepared for deviations. 

As usual, there are three types of mapping. Let $X$ and $Y$ be B-spaces, say over
the complex field $C$. We have then to consider (1) mappings from $R$ or $C$ into $X$,
(2) mappings from $X$ into $R$ or $C$, and (3) mappings of $X$ into $Y$. All three types will
occur in this chapter, but the emphasis is on functions of a real variable. Complex
variables are treated at length in Chapter 8. The mappings involve a concept of
continuity; in fact, there are several such concepts depending upon the underlying
topology. We have a similar variety for the other concepts mentioned above. Thus,
inter alia, we speak of weak, strong, and uniform continuity for operator functions
$T(s) \in \mathfrak{E}(X, Y)$, where $s$ is a real or complex variable. In all these cases a stronger
property implies a weaker one, but the converse may hold and, in fact, does hold for
analyticity. In all such situations the principle of uniform boundedness plays a basic
role and will be discussed in this chapter. We also have to pay some attention to
alternate topologies. The discussion of the Bochner integral is tied up with abstract
measure spaces $(S, A, \mu)$ and requires a considerable expansion of the notions
presented in Chapter 4, a review of which is recommended before the reader tackles
Section 7.5.

There are five sections: The principle of uniform boundedness; Topologies; 
Vector-valued and operator-valued functions; Abstract Riemann-Stieltjes integrals; 
and Bochner integrals. 


7.1 THE PRINCIPLE OF UNIFORM BOUNDEDNESS 
We start with a result for real functionals on a complex B-space $X$. The functionals
need not be linear, nor do they have to be defined everywhere in $X$. They should be
defined and locally continuous on a subset $X_0$ which is not too thinly dispersed in
$X$. Baire second category is a suitable assumption for $X_0$. In this subset some
finiteness properties are also required.


* The author is indebted to Dr. Ih-Ching Hsü for a revision of this chapter.


Theorem 7.1.1. Let $X_0$ be a set of the second category in a complex B-space $X$.
Suppose that $\{f_\alpha(x); \alpha \in A\}$ is a family of continuous real functionals on $X$ such
that for each $x \in X_0$

\[
\sup_{\alpha} f_\alpha(x) < \infty. \tag{7.1.1}
\]

Then in $X$ there is a closed sphere $S_0 = \{x; x \in X, \|x - x_0\| \le \rho\}$ and a finite
number $B$ such that

\[
f_\alpha(x) \le B, \quad \forall x \in S_0, \ \forall \alpha \in A. \tag{7.1.2}
\]

Proof. Let $n$ be a positive integer and consider the set

\[
X_{0n} = \{x; x \in X_0, f_\alpha(x) \le n, \forall \alpha \in A\}. \tag{7.1.3}
\]

For each $n$, $\tilde{X}_{0n} = \{x; x \in X, f_\alpha(x) \le n, \forall \alpha \in A\} \supset X_{0n}$, and is therefore closed in $X$,
by the continuity of the $f_\alpha$'s. For sufficiently large $n$, $\tilde{X}_{0n}$ is not empty. Further,
$\bigcup_{n=1}^{\infty} \tilde{X}_{0n} \supset \bigcup_{n=1}^{\infty} X_{0n} = X_0$. If for each $n$ the set $\tilde{X}_{0n}$ were a set of the first category in $X$,
then $\bigcup_{n=1}^{\infty} \tilde{X}_{0n}$, as a countable union of sets of the first category, would also be of
the first category in $X$. By assumption, $X_0$ is of the second category in $X$, hence
there exists a positive integer $m$ such that $\tilde{X}_{0m}$ is of the second category in $X$ and

\[
\operatorname{Int}(\tilde{X}_{0m}) = \operatorname{Int}(\overline{\tilde{X}_{0m}}) \ne \emptyset.
\]

This says that $\tilde{X}_{0m}$ has interior points and hence contains a sphere. Thus under the
norm concerned, $\tilde{X}_{0m}$ contains a closed sphere, say $S_0 = \{x; x \in X, \|x - x_0\| \le \rho\}$,
in which (7.1.2) holds with $B = m$. ∎


For the properties of Baire categories used in this proof, see Section 5.1 and 
especially Problems 10-12 of Exercise 5.1. 


Corollary 1. If the family $\{f_\alpha(x); \alpha \in A\}$ satisfies

\[
\inf_{\alpha} f_\alpha(x) > -\infty \tag{7.1.4}
\]

for each $x$ in a set $X_0$ of the second category in $X$, then there is a finite number $B$
and a closed sphere $S_0$ in $X$ such that

\[
f_\alpha(x) \ge B, \quad \forall x \in S_0, \ \forall \alpha \in A. \tag{7.1.5}
\]

For $\{-f_\alpha(x); \alpha \in A\}$ satisfies the assumptions of Theorem 7.1.1.


Corollary 2. If the family $\{f_\alpha(x); \alpha \in A\}$ satisfies

\[
\sup_{\alpha} |f_\alpha(x)| < \infty \tag{7.1.6}
\]

for each $x$ in a set $X_0$ of the second category in $X$, then there is a finite number $C$
and a closed sphere $S$ in $X$ such that

\[
|f_\alpha(x)| \le C, \quad \forall x \in S, \ \forall \alpha \in A.
\]

For $\{|f_\alpha(x)|; \alpha \in A\}$ satisfies the assumptions of Theorem 7.1.1.
It is desired to extend the conclusions from a sphere to all of $X$. This requires
further restrictions on the functionals. Here subadditivity of the functionals together
with a bound on $f_\alpha(-x)$ in terms of $f_\alpha(x)$ will provide what is needed.


Theorem 7.1.2. Let $X_0$ be a set of the second category in a complex B-space $X$.
Suppose that a family $F = \{f_\alpha(x); \alpha \in A\}$ of continuous real functionals satisfies
(7.1.6) for each $x$ in $X_0$. Suppose further that $F$ satisfies

\[
f_\alpha(x + y) \le f_\alpha(x) + f_\alpha(y), \quad \forall \alpha \in A, \ \forall x \in X, \ \forall y \in X, \tag{7.1.7}
\]

and that there exists a constant $C$ such that

\[
|f_\alpha(-x)| \le C |f_\alpha(x)|, \quad \forall \alpha \in A, \ \forall x \in X. \tag{7.1.8}
\]

Then there are two numbers $r$ and $M$ such that

\[
|f_\alpha(x)| \le \max\left(M, \frac{M}{r} \|x\|\right) \tag{7.1.9}
\]

for all $\alpha$ and all $x$.


Proof. By the preceding theorem and its corollaries there exist a sphere
$S = \{x; \|x - x_0\| \le \rho\}$ and a constant $B$ such that

\[
|f_\alpha(x)| \le B, \quad \forall x \in S, \ \forall \alpha \in A.
\]

By condition (7.1.8) $|f_\alpha(x)|$ is then also bounded in the sphere

\[
-S = \{x; \|x + x_0\| \le \rho\}.
\]

If $x_0 = 0$, the two spheres coincide and we omit the next step in the proof. If
$x_0 \ne 0$ form the difference set

\[
\Sigma_0 = S - S = \{z; z = x - y, x \in S, y \in S\}. \tag{7.1.10}
\]

This set contains the origin; it is closed, connected and convex. From the
inequalities

\[
f_\alpha(x - y) \le f_\alpha(x) + f_\alpha(-y), \qquad f_\alpha(x) \le f_\alpha(x - y) + f_\alpha(y),
\]

which are implied by (7.1.7), we infer that there is a finite $M$ such that

\[
|f_\alpha(x)| \le M, \quad \forall \alpha \in A, \ \forall x \in \Sigma_0. \tag{7.1.11}
\]

From this basis we proceed to exhaust $X$, leaning heavily on the subadditivity.
Consider the set $\tfrac12 \Sigma_0 = \{u; 2u \in \Sigma_0\}$ and the "shell" $\Sigma_1 = \Sigma_0 \setminus \tfrac12 \Sigma_0$. Form
now the homothetic images of $\Sigma_1$, say $\Sigma_1, \Sigma_2, \ldots, \Sigma_n, \ldots$, where $\Sigma_n = \{v; v/n \in \Sigma_1\}$.
The union of $\tfrac12 \Sigma_0$ with $\bigcup_n \Sigma_n$ exhausts $X$. Since $\Sigma_0$ contains the sphere $\|x\| \le \rho$,
it follows that $\Sigma_1$ is bounded away from the origin and there exists an $r \ge \tfrac12 \rho$ such
that

\[
\inf \|z\| = r \tag{7.1.12}
\]

for $z \in \Sigma_1$. Suppose that $x \in \Sigma_p$ and $x = py$ where $y \in \Sigma_1$. Then

\[
f_\alpha(x) = f_\alpha(py) \le p f_\alpha(y) \le pM \tag{7.1.13}
\]

by (7.1.7) and (7.1.11). Further,

\[
\begin{aligned}
f_\alpha(y) &= f_\alpha[x - (p - 1)y] \\
&\le f_\alpha(x) + f_\alpha[-(p - 1)y] \\
&\le f_\alpha(x) + (p - 1) f_\alpha(-y)
\end{aligned}
\]

so that

\[
f_\alpha(x) \ge f_\alpha(y) - (p - 1) f_\alpha(-y)
\]

and

\[
f_\alpha(x) \ge -pM.
\]

Hence

\[
|f_\alpha(x)| \le pM, \quad \forall \alpha \in A, \ \forall x \in \Sigma_p.
\]

Here

\[
p = \frac{\|x\|}{\|y\|} \le \frac{\|x\|}{r}
\]

so that

\[
|f_\alpha(x)| \le \frac{M}{r} \|x\|
\]

for all $x$ outside of $\tfrac12 \Sigma_0$ where we have instead

\[
|f_\alpha(x)| \le M.
\]

These two inequalities give (7.1.9). If $x_0 = 0$, we simply replace $\Sigma_0$ by $S$ and set
$r = \rho$. ∎


In the preceding theorem the functionals $f_\alpha(x)$ are supposed to be finite in a set
$X_0$ of the second category. Properties (7.1.7) and (7.1.8) then force each $f_\alpha(x)$ to
be finite valued everywhere and, in fact, uniformly bounded on compact sets. For
this reason the theorem and its variants are referred to as the uniform boundedness
principle. The particular case where the $f_\alpha$'s are linear bounded functionals leads to
the following assertion.


Theorem 7.1.3. If each $f_\alpha$ is a real linear bounded functional on $X$ and if in some
set $X_0$ of the second category in $X$

\[
\sup_{\alpha} |f_\alpha(x)| < \infty, \quad \forall x \in X_0, \tag{7.1.14}
\]

then the functionals are uniformly bounded, i.e. there exists a $B$ such that for all $\alpha$

\[
\|f_\alpha\| \le B. \tag{7.1.15}
\]

Proof. The assumptions of Theorem 7.1.2 are clearly satisfied and give

\[
\|f_\alpha\| = \sup_{\|x\| = 1} |f_\alpha(x)| \le \max\{M, M/r\} = B,
\]

which is the assertion. ∎
At this stage it is appropriate to ask if extensions to complex-valued functionals
are possible. The answer is in the affirmative. For if

\[
f_\alpha(x) = g_\alpha(x) + i h_\alpha(x) \tag{7.1.16}
\]

with obvious notation, then $g_\alpha$ and $h_\alpha$ are real-valued functionals and a condition
of the form

\[
\sup_{\alpha} |f_\alpha(x)| < \infty \tag{7.1.17}
\]

implies and is implied by

\[
\sup_{\alpha} |g_\alpha(x)| < \infty, \qquad \sup_{\alpha} |h_\alpha(x)| < \infty. \tag{7.1.18}
\]

This gives


Theorem 7.1.4. If $F = \{f_\alpha(x)\}$ is a family of continuous complex-valued
functionals on a complex B-space $X$ and if there exists a subset $X_0$ of the second
category in $X$ where

\[
\sup_{\alpha} |f_\alpha(x)| < \infty \tag{7.1.19}
\]

for each $x \in X_0$, then there exist a sphere $S$ and a number $B$ with

\[
|f_\alpha(x)| \le B, \quad \forall \alpha \in A, \ \forall x \in S. \tag{7.1.20}
\]

In order to extend Theorem 7.1.2 to complex functionals we must assume that
condition (7.1.7) holds for the real and for the imaginary part of $f_\alpha(x)$. Then the
proof goes through with trivial modification. In Theorem 7.1.3 we can replace
"real" by "complex" without affecting the validity of the theorem.

The following result concerning linear bounded transformations is essentially 
a corollary to Theorem 7.1.2. 


Theorem 7.1.5. Given a family $F$ of linear bounded operators $\{T_\alpha\}$ from one
B-space $X$ to another $Y$ such that for all $x \in X_0$, a subset of the second
category in $X$, we have

\[
\sup_{\alpha} \|T_\alpha(x)\| < \infty, \tag{7.1.21}
\]

then there is a finite $K$ such that

\[
\|T_\alpha\| \le K, \quad \forall \alpha. \tag{7.1.22}
\]

Proof. We consider the corresponding family $G$ of real functionals $\{\|T_\alpha(x)\|\}$.
Here $G$ satisfies the conditions of Theorem 7.1.2 since

\[
\|T_\alpha(x + y)\| \le \|T_\alpha(x)\| + \|T_\alpha(y)\|, \qquad \|T_\alpha(-x)\| = \|T_\alpha(x)\|.
\]

Hence

\[
\|T_\alpha(x)\| \le \max\left\{M, \frac{M}{r} \|x\|\right\}
\]

and for $\|x\| = 1$ this does not exceed $\max(M, M/r) = K$. ∎


This is known as the Uniform Boundedness Theorem. 


EXERCISE 7.1 


1. Verify that the set $\Sigma_0$ of (7.1.10) contains the origin and is closed, connected and
convex.

2. Why is $\Sigma_1$ bounded away from the origin?

3. Show by an example that the conclusion in Theorem 7.1.2 cannot be replaced by
$|f_\alpha(x)| \le B\|x\|$ in general.

4. Let $C_0[-\pi, \pi]$ denote the set of functions $t \to f(t)$, continuous in $[-\pi, \pi]$ with
$f(-\pi) = f(\pi)$ and continued outside of $[-\pi, \pi]$ with period $2\pi$. The metric is
defined by the sup-norm. Define

\[
T_n[f](t) = n\left\{f\left(t + \frac{1}{n}\right) - f(t)\right\}.
\]

Here $\lim T_n[f](t) = f'(t)$ exists for $f$ in a dense subset of $C_0$. Show that
(i) $\|T_n\| = 2n$, (ii) $\lim T_n[f]$ cannot exist in a subset of $C_0$ which is of the second
category, and (iii) the set of non-differentiable functions is of the second category
in $C_0$.

5. The notation used in Section 7.1 suggests that the family of functionals considered
has infinitely many members. Are Theorems 7.1.1 and 7.1.2 valid and significant for
finite families or for a single functional? How about the other theorems?

6. Continuous functions are dense in any Lebesgue space. What is the category of such
a subset?

7. In the space $L_1(S)$, where $S$ is bounded, a family of real-valued functionals is defined
by $\|f\|_p$, $1 \le p < \infty$. What can be said about the category of the subset $L_0$ where
$\sup_p \|f\|_p$ is finite?


7.2 TOPOLOGIES 


Let $X$ be a B-space over the complex field $C$. This implies the existence of a normed
metric in terms of which $X$ is complete. This metric is often referred to as defining
the strong topology of $X$ and if $\lim_{n \to \infty} \|x_n - x_0\| = 0$ we say that the sequence $\{x_n\}$
converges strongly to $x_0$, which is called the strong limit of $\{x_n\}$. The relation is
sometimes written $x_n \xrightarrow{s} x_0$ or $x_0 = \lim x_n$.

Besides the strong topology we shall also have occasion to use the so-called
weak topology. We recall the existence of linear bounded functionals $x^*$ on $X$ the
totality of which form the adjoint or dual space $X^*$. This is also a B-space. We can
use elements of $X^*$ to define neighborhoods in $X$ which are distinct from those
defined by the strong topology. Some definitions are needed.


Definition 7.2.1. A non-void set $S$ is called a Hausdorff topological space if
there exists a family $\{N_\alpha\}$ of sets $N_\alpha \subset S$ called neighborhoods satisfying the
following four postulates:


(H_1) For each $x \in S$ there is at least one neighborhood $N(x)$ containing $x$.
(H_2) If $N_1(x)$ and $N_2(x)$ are neighborhoods of $x$ there is at least one neighborhood
$N_3(x)$ of $x$ such that $N_3(x) \subset N_1(x) \cap N_2(x)$.
(H_3) If $N(x)$ is a neighborhood of $x$ and if $y \in N(x)$, then there is at least one
neighborhood $N(y)$ of $y$ such that $N(y) \subset N(x)$.
(H_4) If $x \ne y$ there are neighborhoods $N(x)$ of $x$ and $N(y)$ of $y$ such that
$N(x) \cap N(y) = \emptyset$.


This concept was introduced by Felix Hausdorff (1868-1942) in 1914. His 
work on point-set topology was fundamental and he also wrote important papers 
on dimension theory, Fourier series, moment problems, and summability. 

It is clear that any B-space is a Hausdorff space with the spheres having centers
at $x = x_0$ as neighborhoods of $x_0$. Moreover, they are linear Hausdorff spaces in
the sense of the following


Definition 7.2.2. A space $X$ is a linear Hausdorff space if (1) it is a Hausdorff
space as well as a linear space, (2) for each choice of $x$ and $y$ in $X$ and neighborhood
$N(x - y)$ of $x - y$ there are neighborhoods $N(x)$ of $x$ and $N(y)$ of $y$ such that the
difference set $N(x) - N(y)$ is contained in $N(x - y)$, (3) for every $\alpha \in C$, $x \in X$ and
neighborhood $N(\alpha x)$ of $\alpha x$ there are neighborhoods $N(\alpha)$ of $\alpha$ and $N(x)$ of $x$
such that the scalar product set $N(\alpha) N(x) \subset N(\alpha x)$.


We recall that the difference set of two subsets $S_1$ and $S_2$ of a linear space is the
set of differences

\[
S_1 - S_2 = \{z; z = x - y, x \in S_1, y \in S_2\}. \tag{7.2.1}
\]

The definition of the scalar product set is obvious:

\[
N(\alpha) N(x) = \{z; z = \beta y, \beta \in N(\alpha), y \in N(x)\}. \tag{7.2.2}
\]

For $\alpha$ in $C$ we can use Euclidean neighborhoods, $N(\alpha)$.
On a given B-space $X$ we are interested in constructing a weakest topology in
which each member $x^*$ of $X^*$ is continuous. (Cf. Problem 4, Exercise 7.2.) We
proceed as follows. Let $x_0 \in X$, $\varepsilon > 0$, $n$ any positive integer, and choose any $n$
elements of $X^*$, say $x_1^*, x_2^*, \ldots, x_n^*$. Denote the set

\[
\{x; x \in X, |x_k^*(x - x_0)| < \varepsilon, k = 1, 2, \ldots, n\} \tag{7.2.3}
\]

by $N(x_0; x_1^*, x_2^*, \ldots, x_n^*; \varepsilon)$.
Theorem 7.2.1. The family $\{N_\alpha\}$ of all sets of the type (7.2.3) obtained by varying
all elements (including $x_0$ and $n$) satisfies postulates (H_1) to (H_4). $X$ together
with the family of neighborhoods $\{N_\alpha\}$ is a linear Hausdorff space.


Proof. It is clear that $X$ is a linear space since we started out with a B-space.
Further, the Hausdorff axiom (H_1) is satisfied. To prove (H_2) consider any two
neighborhoods $N_1$ and $N_2$ of $x_0$, say

\[
\begin{aligned}
N_1 &= N(x_0; x_1^*, x_2^*, \ldots, x_m^*; \varepsilon_1), \\
N_2 &= N(x_0; x_{m+1}^*, x_{m+2}^*, \ldots, x_{m+n}^*; \varepsilon_2).
\end{aligned} \tag{7.2.4}
\]

It is, of course, permitted for some functionals in $N_1$ to coincide with functionals
in $N_2$. If now $0 < \varepsilon \le \min(\varepsilon_1, \varepsilon_2)$ we form

\[
N_3 = N(x_0; x_1^*, \ldots, x_m^*, x_{m+1}^*, \ldots, x_{m+n}^*; \varepsilon). \tag{7.2.5}
\]

Here

\[
|x_k^*(x - x_0)| < \varepsilon
\]

for $k = 1, \ldots, m, m + 1, \ldots, m + n$. This implies

\[
|x_i^*(x - x_0)| < \varepsilon_1, \qquad |x_j^*(x - x_0)| < \varepsilon_2
\]

for $i = 1, \ldots, m$, $j = m + 1, \ldots, m + n$, so that $N_3 \subset N_1 \cap N_2$ and (H_2) is satisfied.
If now $y_0 \in N(x_0)$, we take

\[
N(y_0) = N(y_0; x_1^*, \ldots, x_m^*; \delta) \tag{7.2.6}
\]

with the same functionals as in $N(x_0)$ and $\delta$ sufficiently small. It suffices to take

\[
\delta < \varepsilon - \max_k |x_k^*(x_0 - y_0)| \tag{7.2.7}
\]

to ensure that $N(y_0) \subset N(x_0)$ and postulate (H_3) is satisfied. Verification of (H_4)
calls for some knowledge of the existence of functionals meeting specific requirements.
If $x_0$ and $y_0$ are two distinct elements of $X$, then we can find a linear bounded
functional $x^*$ such that $x^*(x_0) \ne x^*(y_0)$. See part (3) of Theorem 2.3.2 and the
discussion in Section 10.3. Set

\[
|x^*(x_0) - x^*(y_0)| = |x^*(x_0 - y_0)| = \delta
\]

and define the neighborhoods

\[
N(x_0) = N(x_0; x^*; \varepsilon), \qquad N(y_0) = N(y_0; x^*; \varepsilon), \tag{7.2.8}
\]

where $0 < \varepsilon \le \tfrac12 \delta$. This choice implies $N(x_0) \cap N(y_0) = \emptyset$.
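It remains to prove continuity of addition and scalar multiplication in the sense
of Definition 7.2.2. Suppose that $x_0$ and $y_0$ are given together with a neighborhood
$N(z_0)$ of $z_0 = x_0 - y_0$ defined so that $z \in N(z_0)$ iff

\[
|x_k^*(z - z_0)| < \varepsilon, \quad k = 1, \ldots, n,
\]

for a given set of functionals $x_k^*$. We then set

\[
\begin{aligned}
N(x_0) &= [x; x = x_0 + \tfrac12 (u - z_0), u \in N(z_0)], \\
N(y_0) &= [y; y = y_0 + \tfrac12 (v - z_0), v \in N(z_0)].
\end{aligned} \tag{7.2.9}
\]

Then

\[
N(x_0) - N(y_0) = [z; z = x - y = z_0 + \tfrac12 (u - v), u \in N(z_0), v \in N(z_0)]
\]

and

\[
x_k^*(x - y - z_0) = \tfrac12 x_k^*(u) - \tfrac12 x_k^*(v) = \tfrac12 x_k^*(u - z_0) - \tfrac12 x_k^*(v - z_0).
\]

This difference is $< \varepsilon$ in absolute value so that

\[
N(x_0) - N(y_0) \subset N(z_0) = N(x_0 - y_0).
\]

Without restricting the generality we may assume $\alpha \ne 0$ and $x_0 \ne 0$ in the
scalar product. Let $N(u_0)$ be a neighborhood of $u_0 = \alpha x_0$ where for certain
functionals $x_k^*$ we have

\[
|x_k^*(u - u_0)| < \varepsilon, \quad k = 1, \ldots, n.
\]

Let $m = \max_k |x_k^*(x_0)|$ and define

\[
\begin{aligned}
N(\alpha) &= [\beta; |\beta - \alpha| < \varepsilon_1], \\
N(x_0) &= [x; |x_k^*(x - x_0)| < \varepsilon_2, k = 1, \ldots, n],
\end{aligned} \tag{7.2.10}
\]

where

\[
\varepsilon_1 = \frac{\varepsilon}{2m}, \qquad \varepsilon_2 = \frac{m\varepsilon}{2m|\alpha| + \varepsilon}.
\]

If $\beta x \in N(\alpha) N(x_0)$, then

\[
\beta x - \alpha x_0 = (\beta - \alpha) x_0 + \beta (x - x_0).
\]

Hence

\[
x_k^*(\beta x - \alpha x_0) = (\beta - \alpha) x_k^*(x_0) + \beta x_k^*(x - x_0),
\]

the absolute value of which does not reach $\varepsilon$ by the choice of $N(\alpha)$ and $N(x_0)$:
the first term is less than $\varepsilon_1 m = \tfrac12 \varepsilon$ in absolute value, and since
$|\beta| < |\alpha| + \varepsilon_1$ the second is less than $(|\alpha| + \varepsilon_1)\varepsilon_2 = \tfrac12 \varepsilon$.
It follows that $N(\alpha) N(x_0) \subset N(\alpha x_0)$ and that $X$ is a linear Hausdorff space
in the chosen neighborhood topology. ∎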


Some comments are in order. The family $\{N_\alpha\}$ of neighborhoods generates a
topology on $X$ in which the open sets are arbitrary unions of the sets $N_\alpha$. Note that
by virtue of (H_1), (H_2), and (H_3), the finite intersection of open sets is open; the
space $X$ is open and so is the empty set. Call this topology the weak topology. Note,
furthermore, that all the sets $N(x_0; x_1^*, \ldots, x_n^*; \varepsilon)$ are open; in fact, the class of all
these sets with $x_0$ fixed is a local basis at $x_0$ in the weak topology. It is asked in
Problem 4, Exercise 7.2, to prove the weak topology in $X$ is the weakest topology in
which each member $x^*$ of $X^*$ is continuous.

While functionals, addition, and scalar multiplication are continuous in the
weak topology, such a simple function as the norm of $x$ is not continuous in $x$ in the
weak topology. This follows from the peculiar structure of the neighborhoods.
For any $y \in X$ we can find a linear bounded functional $x^*$ such that $x^*(y) = 0$.
We exclude the trivial case $y = 0$. It follows that

\[
x_0 + \alpha y \in N(x_0; x^*; \varepsilon) \tag{7.2.11}
\]

for any choice of $\alpha$, $\varepsilon$, $x_0$. Thus this neighborhood of $x_0$ contains elements of $X$ of
arbitrarily large norm and this precludes continuity of the norm in the weak
topology.

We have now various notions of convergence and completeness associated 
with the weak topology. 


Definition 7.2.3. The sequence $\{x_n\} \subset X$ is said to be weakly convergent if
$\{x^*(x_n)\}$ is a convergent sequence for each choice of $x^* \in X^*$; it is weakly
convergent to $x_0 \in X$ if $\lim_{n \to \infty} x^*(x_n) = x^*(x_0)$ for all $x^*$.

It is clear that a strongly convergent sequence is also weakly convergent, but
the converse is normally not true. Thus in $L_2(-\pi, \pi)$ the sequence $\{e^{int}\}$ converges
weakly to zero but not strongly. We have

\[
x^*[e^{int}] = \int_{-\pi}^{\pi} e^{int} g(t)\,dt \tag{7.2.12}
\]

for some $g \in L_2$ depending upon $x^*$ and the Fourier coefficients of any function in
$L_2$ go to zero. On the other hand, the norm of $e^{int}$ is $(2\pi)^{1/2}$ for all $n$.
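The contrast between weak and strong convergence in this example can be observed numerically. The sketch below is an illustration under stated assumptions: the integral (7.2.12) is approximated by a midpoint rule, and the particular choice $g(t) = t$ is hypothetical (for it the exact value of $|x^*[e^{int}]|$ is $2\pi/n$). The function names are not from the text.

```python
import cmath
import math

def pairing(g, n, N=20000):
    """Midpoint-rule approximation of the integral of e^{int} g(t) over (-pi, pi)."""
    h = 2.0 * math.pi / N
    total = 0j
    for k in range(N):
        t = -math.pi + (k + 0.5) * h
        total += cmath.exp(1j * n * t) * g(t)
    return total * h

def l2_norm_of_exponential(n, N=20000):
    """Midpoint-rule approximation of the L_2(-pi, pi) norm of e^{int}."""
    h = 2.0 * math.pi / N
    s = sum(abs(cmath.exp(1j * n * (-math.pi + (k + 0.5) * h))) ** 2 for k in range(N))
    return math.sqrt(s * h)

g = lambda t: t                      # hypothetical element of L_2(-pi, pi)
small = abs(pairing(g, 50))          # close to 2*pi/50
large = abs(pairing(g, 1))           # close to 2*pi
norm50 = l2_norm_of_exponential(50)  # close to sqrt(2*pi), independent of n
```

The pairings decay like $1/n$ while the norm stays at $(2\pi)^{1/2}$, so the sequence cannot converge strongly to zero.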


Definition 7.2.4. A B-space is said to be weakly complete if every weakly 
convergent sequence in X converges weakly to an element of X. 


There is a class of B-spaces which have the property of weak completeness, 
namely the reflexive spaces, so named by E. R. Lorch in 1939. The earlier name was 
the much-overworked term regular, due to Hans Hahn (1879-1934) in 1927. To 
explain this notion we must digress on (conjugate) dual spaces. 


We note that $X^*$ being a B-space also has a dual space denoted by $X^{**}$ and
called the second dual (second conjugate) of $X$. Its elements are denoted by symbols
like $x^{**}$. Actually we can specify some of the elements of $X^{**}$ by taking a sharper
look at the symbol $x^*(x)$. This expression is linear in each of its two arguments $x$
and $x^*$. For a fixed $x$ we can vary $x^*$ over $X^*$. This is now a linear functional on
$X^*$. Thus we can define

\[
x^{**}(x^*) = x^*(x). \tag{7.2.13}
\]

This functional on $X^*$ is linear. It is also bounded for

\[
|x^{**}(x^*)| = |x^*(x)| \le \|x^*\| \, \|x\|,
\]

so that $\|x^{**}\| \le \|x\|$. Actually equality must hold here, for by Theorem 2.3.2
there is an $x^* \in X^*$ such that $\|x^*\| = 1$ and $x^*(x) = \|x\|$. Thus $x^{**}$ is a linear
bounded functional on $X^*$ and an element of $X^{**}$.

Formula (7.2.13) defines a mapping of $X$ onto a subset $X_0$ of $X^{**}$. This mapping
is an isometry, $\|x\| = \|x^{**}\|$, and thus 1-1 and onto. Further, $x \leftrightarrow x^{**}$,
$y \leftrightarrow y^{**}$ implies $\alpha x + \beta y \leftrightarrow \alpha x^{**} + \beta y^{**}$. Thus the mapping is an isomorphic and
isometric homeomorphism. This gives


Theorem 7.2.2. $X \subset X^{**}$.

Definition 7.2.5. $X$ is said to be reflexive if $X^{**} = X$.


Let us consider in passing an application of the uniform boundedness principle 
based on the second dual. 


Theorem 7.2.3. If for every $x$ in some set $S \subset X$ and for every $x^* \in X^*$ we have

\[
\sup_{x \in S} |x^*(x)| < \infty, \tag{7.2.14}
\]

then $S$ is a bounded set.

Proof. We use the standard embedding of $X$ into $X^{**}$. Then for $x$ fixed in $S$

\[
x^*(x) = x^{**}(x^*) = y(x^*),
\]

where $y$ is an element of $X^{**}$ uniquely determined by $x$. Thus for each $x \in S$ there is a
unique linear bounded transformation $y$ on $X^*$ to $C$ and, furthermore,

\[
\sup_{y} |y(x^*)| < \infty
\]

for each $x^* \in X^*$. By Theorem 7.1.5 this implies that the norms of this family of
linear transformations are uniformly bounded. Hence there is an $M$ such that

\[
\|y\| = \|x^{**}\| = \|x\| \le M
\]

for all $x \in S$. This shows that $S$ is bounded. ∎


Corollary 1. A weakly convergent sequence in $X$ is bounded.


For $\sup_n |x^*(x_n)| < \infty$ for each $x^*$.


We can go one step further using


Definition 7.2.6. A subset $S$ of a B-space is said to be sequentially weakly
compact if every sequence of elements in $S$ contains a subsequence which converges
weakly to an element of $S$.


Corollary 2. A sequentially weakly compact subset $S$ of a B-space $X$ is bounded.

Proof. The assumption implies that for each fixed $x^* \in X^*$ the set of complex
numbers $\{x^*(x); x \in S\}$ is sequentially compact. Now a set of complex numbers
has this property iff it is bounded and closed. The boundedness condition is simply
(7.2.14) so the conclusion follows. ∎


Theorem 7.2.4. A reflexive space is weakly complete. 


Proof. Let $\{x_n\}$ be a weakly convergent sequence in the reflexive space $X$.
For any given $\varepsilon > 0$ there is then an integer $N$ depending upon $\varepsilon$ and the functional
$x^* \in X^*$ such that

\[
|x^*(x_m) - x^*(x_n)| < \varepsilon, \quad m, n > N. \tag{7.2.15}
\]

Since $X$ is reflexive, the embedding mapping $X \to X^{**} = X$ leads to a sequence
$\{y_n\} \subset X^{**}$ such that $x_n \leftrightarrow y_n$ and we have

\[
x^*(x_n) = y_n(x^*). \tag{7.2.16}
\]

Since for each $x^*$

\[
|y_m(x^*) - y_n(x^*)| < \varepsilon, \quad m, n > N,
\]

by (7.2.15) and (7.2.16), it is seen that the sequence of complex numbers $\{y_n(x^*)\}$ is
convergent so that

\[
\lim_{n \to \infty} y_n(x^*) \tag{7.2.17}
\]

exists for every $x^*$. By Theorem 7.2.3 this implies the existence of a finite positive
$M$ such that

\[
\|y_n\| \le M. \tag{7.2.18}
\]

Note that each $y_n$ defines a linear bounded transformation from $X^*$ to $C$ and the
norms of these transformations are uniformly bounded by Theorem 7.2.3. Set

\[
\lim_{n \to \infty} y_n(x^*) = y_0(x^*). \tag{7.2.19}
\]

Here $y_0$ defines a linear transformation on $X^*$ to $C$ which is bounded by (7.2.18) so
that $y_0 \in X^{**} = X$. Now in the mapping $x \leftrightarrow x^{**}$ there is a unique element $x_0 \in X$
which corresponds to $y_0$ and

\[
y_0(x^*) = x^*(x_0). \tag{7.2.20}
\]

Finally, for fixed $x^*$ and $n > N$,

\[
\varepsilon \ge \lim_{m \to \infty} |y_m(x^*) - y_n(x^*)| = |y_0(x^*) - y_n(x^*)| = |x^*(x_0) - x^*(x_n)|.
\]

Since $\varepsilon$ is arbitrary, this shows that $x_n$ converges weakly to $x_0$, which is often written

\[
x_n \xrightarrow{w} x_0. \tag{7.2.21}
\]

Thus it has been shown that in a reflexive space a weakly convergent sequence has a
weak limit. ∎


Theorem 7.2.5. A B-space $X$ is reflexive iff its dual $X^*$ is reflexive.

Proof. If $X$ is reflexive, then $X = X^{**}$ in the natural embedding. On the other
hand,

\[
X^* \subset (X^*)^{**} = (X^{**})^* = X^*. \tag{7.2.22}
\]

Since the second dual of $X^*$ cannot be a proper subspace of $X^*$ under the natural
embedding, the inclusion in (7.2.22) must reduce to equality so that $X^*$ is also
reflexive.

Suppose now that $X^*$ is reflexive. Then so is $X^{**}$. Now we have $X \subset X^{**}$ and
since $X^* = (X^{**})^*$, the dual spaces of $X$ and $X^{**}$ are identical. If $X$ were a proper
linear subspace of $X^{**}$, then by Theorem 2.3.2 there would exist a linear bounded
functional on $X^{**}$ of norm one which is zero on $X$. The statement that $X$ and $X^{**}$
have the same linear functionals must imply that there is one and only one linear
bounded functional which takes on given values on $X$. Such a functional can be
extended to all of $X^{**}$ in one and only one way. Now the zero functional on $X$
admits as extension to $X^{**}$ the zero functional on $X^{**}$. Since there is a unique
extension our functional must be the zero functional on $X^{**}$. This shows that $X$
cannot be a proper linear subspace of $X^{**}$. Thus $X = X^{**}$ and $X$ is reflexive. ∎


We state without proof one more important property of reflexive spaces. 


Theorem 7.2.6. The closed unit sphere of a reflexive space is weakly sequentially 
compact. 


We note that a Hilbert space is reflexive by virtue of Theorem 2.5.11.

The rest of this section will be devoted to a brief discussion of operator
topologies. If $\mathfrak{X}$ is a complex B-space we denote by $\mathfrak{E}(\mathfrak{X})$ as usual the B-space of all
linear bounded transformations $T$ from $\mathfrak{X}$ into $\mathfrak{X}$. We have available at the outset
the normed operator topology on $\mathfrak{E}(\mathfrak{X})$. In this context this is usually referred to as
the uniform operator topology based on the distance

$$d(T_1, T_2) = \|T_1 - T_2\|. \tag{7.2.23}$$

But in addition we can introduce various neighborhoods in analogy with
Definition 7.2.3 above.


Definition 7.2.7. A strong operator neighborhood of $T_0 \in \mathfrak{E}(\mathfrak{X})$ is any set of the
form

$$N(T_0; x_1, \ldots, x_n; \varepsilon) = \{T;\ \|T(x_k) - T_0(x_k)\| < \varepsilon;\ k = 1, \ldots, n\},$$

where $x_1, \ldots, x_n$ are $n$ arbitrary elements of $\mathfrak{X}$.
A weak operator neighborhood of $T_0$ is any set of the form

$$N(T_0; x_1, \ldots, x_n; x_1^*, \ldots, x_n^*; \varepsilon) = \{T;\ |x_j^*[T(x_k)] - x_j^*[T_0(x_k)]| < \varepsilon\},$$

where $x_1, \ldots, x_n$ and $x_1^*, \ldots, x_n^*$ are arbitrary elements of $\mathfrak{X}$ and $\mathfrak{X}^*$ respectively.


The set of all strong operator neighborhoods constitutes a neighborhood
covering of $\mathfrak{E}(\mathfrak{X})$ in terms of which this space becomes a linear Hausdorff space in
the sense of Definitions 7.2.1 and 7.2.2. The same holds for the set of weak operator
neighborhoods. The argument, which largely parallels the proof of Theorem 7.2.1,
is left to the reader.

Now $\mathfrak{E}(\mathfrak{X})$ is not merely a B-space, it is a B-algebra in terms of the uniform
topology. This raises the question of corresponding structural properties of $\mathfrak{E}(\mathfrak{X})$
in terms of the strong and the weak operator topologies, respectively. Here the basic
concept is that of a topological algebra. Since we restrict ourselves to Hausdorff
topologies, we may limit ourselves to Hausdorff algebras.


Definition 7.2.8. An algebra $\mathfrak{A}$ is a Hausdorff algebra if (1) $\mathfrak{A}$ is an algebra as
well as a linear Hausdorff space and if (2) to every pair $x, y \in \mathfrak{A}$ and each
neighborhood $N(xy)$ of $xy$ there are neighborhoods $N(x)$ of $x$ and $N(y)$ of $y$ such
that $xN(y) \subset N(xy)$ and $N(x)y \subset N(xy)$.


It should be noted that $\mathfrak{A}$ is normally non-commutative as in the case $\mathfrak{E}(\mathfrak{X})$
under consideration. Hence we have to write $xN(y)$ and $N(x)y$. The first is the set
of products $xv$ with $v \in N(y)$, the second the set of products $uy$ with $u \in N(x)$. Note
also that we do not demand that $N(x)N(y) \subset N(xy)$. We have now


Theorem 7.2.7. $\mathfrak{E}(\mathfrak{X})$ is a Hausdorff algebra in terms of the strong operator
topology as well as in terms of the weak one.


The proof is left to the reader. 


EXERCISE 7.2 


1. Verify that condition (7.2.7) suffices for the stated purpose, $N(y_0) \subset N(x_0)$.

2. Same question for (7.2.8).

3. Fill in missing details in the proof of $N(x) - N(y) \subset N(x - y)$ and of
$N(\alpha) N(x) \subset N(\alpha x)$.


4. Let T be the weak topology on a given B-space $\mathfrak{X}$. Prove that (a) with respect to T,
any $x^* \in \mathfrak{X}^*$ is continuous, and (b) if U is a topology on $\mathfrak{X}$ in which each $x^* \in \mathfrak{X}^*$
is continuous, then T ⊂ U. Here the inclusion asserts that every set open with respect
to T is open with respect to U.


5. Construct the weak topology for the space $\mathfrak{M}_n$ of $n$ by $n$ matrices of complex
numbers.


6. Show that $\mathfrak{M}_n$ is a Hausdorff algebra in the sense of Definition 7.2.8 in the weak
topology.


7. Show that the sequence space $l_p$ is reflexive for $1 < p < \infty$.
8. Prove that the Lebesgue space $L_p(a, b)$ is reflexive for $1 < p < \infty$.
9. What is wrong with p = 1 in the preceding problems? 
10. Show that the sequence $\{t^n;\ n = 1, 2, \ldots\}$ in $C[0, 1]$ is weakly convergent but has no
weak limit. [Hint: Recall the form of the linear bounded functionals on this space.]


11. Prove Theorem 7.2.6. 


12. From Theorem 7.2.6 and the result in Problem 10 conclude that C [0, 1], more 
generally C[a, b], is not reflexive. 


13. Is $C^n$ reflexive?
14. Taking $\mathfrak{M}_n$ as $\mathfrak{E}(C^n)$, discuss the strong and the weak operator algebras on $C^n$.


15. Prove for a general $\mathfrak{E}(\mathfrak{X})$ that it is a linear Hausdorff space in terms of the strong
operator topology.


16. Same question for the weak operator topology.
17. Prove Theorem 7.2.7 for the strong operator topology.
18. Give a definition of the strong operator topology for the space $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$.


19. Same question for the weak operator topology.


7.3 VECTOR-VALUED AND OPERATOR-VALUED FUNCTIONS 


We shall be concerned with mappings of $R^1$ into either a B-space $\mathfrak{X}$ over the complex
field or into a space $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$ of linear bounded operators from $\mathfrak{X}$ into $\mathfrak{Y}$, two
B-spaces over the complex field. A function of the first kind will be called a vector
function and denoted by $x(s)$ or similar symbol. Functions of the second kind are
called operator functions and denoted by $T(s)$, $U(s)$ or the like. We can look upon
such functions as a family of elements of $\mathfrak{X}$ or of $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$, respectively, where the
individual elements are indexed by a real parameter $s$.

Such objects have occurred repeatedly in previous chapters of this treatise.
Thus an $n$ by $n$ matrix $A(s) = (a_{jk}(s))$, where the elements are complex-valued
functions of a real variable $s$, occurs in a number of places, as solutions of implicit
matrix equations, as solutions of matrix differential equations, or as the resolvent
of some constant matrix. Here $A(s)$ figures as a vector function in the B-space $\mathfrak{M}_n$.
But the same matrix-valued function may be regarded as an element of $\mathfrak{E}(C^n)$
defining a vector function $y(s) \in C^n$ by

$$y(s) = A(s)\,x,$$

where $x \in C^n$. In this setting, $A(s)$ is an operator function.
Now vector and operator functions may have continuity properties with respect
to $s$. Since we admit two types of topology in $\mathfrak{X}$ and three in $\mathfrak{E}(\mathfrak{X})$ (the same types
apply in $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$), we have a corresponding variety of notions of continuity.


Definition 7.3.1. A vector function $x(s)$ which is defined on a subset $S$ of
$R^1$ with values in the B-space $\mathfrak{X}$ is (1) weakly continuous at $s = s_0 \in S$ if
$\lim_{s \to s_0} |x^*[x(s)] - x^*[x(s_0)]| = 0$ for each $x^* \in \mathfrak{X}^*$, (2) strongly continuous at $s_0$
if $\lim_{s \to s_0} \|x(s) - x(s_0)\| = 0$.
Definition 7.3.2. An operator function $T(s)$ from $S \subset R^1$ to $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$ is continuous
at $s = s_0$ in the sense of (1) the weak operator topology if
$\lim_{s \to s_0} |y^*[T(s)x] - y^*[T(s_0)x]| = 0$ for each $x \in \mathfrak{X}$, $y^* \in \mathfrak{Y}^*$, (2) the strong operator
topology if $\lim_{s \to s_0} \|T(s)x - T(s_0)x\| = 0$ for each $x \in \mathfrak{X}$, and (3) the uniform operator
topology if $\lim_{s \to s_0} \|T(s) - T(s_0)\| = 0$. If no confusion is likely to arise, we speak
of weak, strong or uniform continuity in these cases.


In general these notions are distinct. In this respect the special case mentioned 
above is misleading for the weak continuity of the vector function A(s) implies 
strong continuity and the weak continuity of the operator function A(s) implies 
strong continuity, which in its turn implies uniform continuity. 
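
In infinite-dimensional spaces the distinction between strong and uniform operator continuity already shows up for very simple families. The following Python sketch is an illustration only and is not taken from the text. It assumes a truncation of the sequence space $l_2$ to its first `N` coordinates, and the names `tail_norm`, `projection_error_norm` and the test vector $x_k = 1/k$ are hypothetical choices. With $P_n$ the projection onto the first $n$ coordinates, one may think of $T(1/n) = P_n$ and $T(0) = I$: the quantity $\|P_n x - x\|$ becomes small for the fixed vector $x$, while $\|P_n - I\| = 1$ for every $n < N$.

```python
import math

N = 5000  # truncation dimension standing in for l_2 (an assumption of this sketch)

def tail_norm(x, n):
    """||x - P_n x||, where P_n keeps the first n coordinates of x."""
    return math.sqrt(sum(v * v for v in x[n:]))

def projection_error_norm(n, dim):
    """Operator norm of I - P_n on R^dim.

    I - P_n is diagonal with n zeros followed by dim - n ones, and the norm
    of a diagonal operator is its largest absolute diagonal entry.
    """
    diagonal = [0.0] * n + [1.0] * (dim - n)
    return max(abs(d) for d in diagonal)

# Fixed vector with square-summable entries (hypothetical test vector).
x = [1.0 / k for k in range(1, N + 1)]

for n in (10, 100, 1000):
    # First column shrinks with n, second column stays equal to 1.
    print(n, tail_norm(x, n), projection_error_norm(n, N))
```

The truncation is only a stand-in: in a genuinely finite-dimensional space the notions coincide, as the remark above on the matrix case indicates, and here $P_N = I$.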

For differentiability we have a similar wealth of possibilities. We state the 
definition for vector functions and let the reader formulate corresponding notions 
for operators. 


Definition 7.3.3. A vector function $x(s)$ from the interval $(a, b)$ to the B-space $\mathfrak{X}$
is weakly (strongly) differentiable at $s = s_0$ if there is an element $x'(s_0) \in \mathfrak{X}$ such
that the difference quotient $h^{-1}[x(s_0 + h) - x(s_0)]$ tends weakly (strongly) to
$x'(s_0)$ as $h \to 0$. We call $x'(s_0)$ the weak (strong) derivative of $x(s)$ at $s = s_0$.


Here there is still another possibility. We could demand that all the functionals
$x^*[x(s)]$ are differentiable at $s = s_0$. This will be the case if $x(s)$ is weakly
differentiable at $s = s_0$, but the stated condition does not imply the existence of a weak
derivative.


Theorem 7.3.1. If the weak derivative of x(s) exists and is zero everywhere in 
(a, b), then x(s) is a constant. 


Proof. The assumption is that

$$\frac{d}{ds}\, x^*[x(s)] = 0, \quad \forall x^* \in \mathfrak{X}^*.$$

Hence

$$x^*[x(s)] = x^*[x(s_0)]$$

for a fixed $s_0 \in (a, b)$. This says that

$$x^*[x(s) - x(s_0)] = 0, \quad \forall x^* \in \mathfrak{X}^*.$$

But the zero element of $\mathfrak{X}$ is the only one that can be annihilated by all functionals
$x^* \in \mathfrak{X}^*$. This gives $x(s) = x(s_0)$, a constant. ∎


We shall have to consider Riemann-Stieltjes integrals with a vector- or
operator-valued function as integrator. This requires the notion of abstract-valued functions
of bounded variation. We give the definitions only for vector-valued functions.
Definition 7.3.4. A vector function $x(s)$ from the interval $[a, b]$ to the B-space $\mathfrak{X}$ is
of (1) weak bounded variation in $[a, b]$ if $x^*[x(s)] \in BV[a, b]$ for every $x^* \in \mathfrak{X}^*$,
(2) bounded variation if $\sup \|\sum_j [x(t_j) - x(s_j)]\| < \infty$, the supremum being taken over
every choice of a finite number of non-overlapping intervals $(s_j, t_j)$ in $[a, b]$, and
(3) strong bounded variation if $\sup \sum_j \|x(s_j) - x(s_{j-1})\| < \infty$ where all possible
partitions of $[a, b]$ are considered. The two suprema are known as the total and the
strong total variation of $x(s)$ respectively.


We leave it to the reader to verify that strong bounded variation implies 
bounded variation and that the latter implies weak bounded variation. 


Theorem 7.3.2. If x(s) is of strong bounded variation in [a,b], then x(s) can 
have only a countable number of discontinuities and has one-sided limits every- 
where in [a, b]. 


Proof. Let $V(t)$ denote the strong total variation of $x(s)$ in the interval $[a, t]$,
$a \le t \le b$. This is a real non-negative, non-decreasing bounded function of $t$. As
such it has left and right hand limits everywhere in $(a, b)$, a right hand limit at $t = a$,
and a left hand limit at $t = b$. Further, $V(t)$ can have only a countable number of
discontinuities. Since

$$0 \le \|x(s_2) - x(s_1)\| \le V(s_2) - V(s_1), \quad s_1 < s_2, \tag{7.3.1}$$

we see that $x(s)$ is left hand (right hand) continuous wherever $V(s)$ has this
property. In particular, $x(s)$ has at most a countable number of discontinuities.
We have to show that one-sided limits exist.

Let $s = s_0$ be a discontinuity of $V(s)$. To fix the ideas, suppose that
$V(s_0 + 0) - V(s_0) = A > 0$. Suppose that $0 < \alpha < \beta$ where $\beta$ is small. $V(s)$ need
not be continuous in $[s_0 + \alpha, s_0 + \beta]$, but its increase in the interval is small as soon
as $\beta$ is small regardless of the value of $\alpha > 0$. Suppose that

$$V(s_0 + \beta) - V(s_0 + \alpha) < \delta \tag{7.3.2}$$

for all $\alpha > 0$. We now take two sequences $\{s_n\}$ and $\{t_n\}$ in $(s_0, s_0 + \beta)$ which both
descend to $s_0$ and consider the corresponding sequences $\{x(s_n)\}$ and $\{x(t_n)\}$. It is
desired to prove that they are convergent and, in fact, converge to the same limit
in $\mathfrak{X}$.

Convergence follows from the inequality

$$\sum_{k=m}^{n} \|x(s_{k-1}) - x(s_k)\| \le V(s_{m-1}) - V(s_n) < \delta,$$

which holds for all $n > m$. Since $\delta$ is arbitrary, this implies that $\lim_{k \to \infty} x(s_k) = x_1$
exists and the existence of $\lim_{k \to \infty} x(t_k) = x_2$ is proved in the same manner.


If, now, $x_1 \ne x_2$, we can use the fact that $x(s_k)$ is close to $x_1$ while $x(t_k)$ is close
to $x_2$ to infer that an integer $k$, arbitrarily large, may be found such that

$$\|x(s_k) - x_1\| < \tfrac{1}{3}\|x_2 - x_1\|, \qquad \|x(t_k) - x_2\| < \tfrac{1}{3}\|x_2 - x_1\|.$$

Then

$$\|x(t_k) - x(s_k)\| = \|x(t_k) - x_2 + x_2 - x_1 + x_1 - x(s_k)\| \ge \|x_2 - x_1\| - \|x(t_k) - x_2\| - \|x(s_k) - x_1\| > \tfrac{1}{3}\|x_2 - x_1\|.$$

On the other hand,

$$\|x(t_k) - x(s_k)\| < \delta$$

by the choice of the sequences. Here $\delta$ is arbitrarily small so that the assumption
$x_1 \ne x_2$ leads to a contradiction. It follows that $x(s)$ has a unique right hand limit
at $s = s_0$ and the existence and uniqueness of left hand limits is proved in the same
manner. ∎


Weak and strong properties are never independent of each other. In the present 
case we know that bounded variation implies weak bounded variation. As shown 
by N. Dunford and I. Gelfand, independently of each other, in 1938, the converse 
is also true. 


Theorem 7.3.3. A function of weak bounded variation is of bounded variation.


Proof. The argument is an application of the uniform boundedness principle.
Let $V[f]$ denote the total variation in $[a, b]$ of a numerically valued function
$f \in BV[a, b]$. We have then for any choice of a finite number of non-overlapping
intervals $[a_j, b_j]$ in $[a, b]$

$$\Big| x^*\Big\{ \sum_j [x(b_j) - x(a_j)] \Big\} \Big| \le \sum_j |x^*[x(b_j)] - x^*[x(a_j)]| \le V\{x^*[x(s)]\}.$$

The last member is finite for every $x^* \in \mathfrak{X}^*$. We can then use Theorem 7.2.3 to
conclude that there is an $M > 0$ such that

$$\Big\| \sum_j [x(b_j) - x(a_j)] \Big\| \le M$$

for each choice of intervals $[a_j, b_j]$ and this shows that $x(s)$ is actually of bounded
variation. ∎


On the other hand, it may be shown by examples that a function of bounded 
variation need not be of strong bounded variation. See Problem 7, Exercise 7.3. 
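
The gap between bounded variation and strong bounded variation can be made concrete numerically. The sketch below is an illustration under stated assumptions, not part of the text: it takes $x(s)$ to be the characteristic function of $[0, s]$ regarded as an element of $L_2(0, 1)$, and uses the closed-form fact that the difference $x(t) - x(s)$, $s < t$, is the indicator of $(s, t]$ with $L_2$ norm $(t - s)^{1/2}$. All function names are hypothetical.

```python
import math

def l2_norm_of_increment(s_prev: float, s_next: float) -> float:
    """L_2(0,1) norm of x(s_next) - x(s_prev), the indicator of (s_prev, s_next]."""
    return math.sqrt(s_next - s_prev)

def strong_variation_sum(partition: list[float]) -> float:
    """Sum of ||x(s_k) - x(s_{k-1})|| over consecutive partition points."""
    return sum(l2_norm_of_increment(a, b) for a, b in zip(partition, partition[1:]))

def norm_of_summed_increments(intervals: list[tuple[float, float]]) -> float:
    """|| sum_j [x(t_j) - x(s_j)] || for non-overlapping intervals (s_j, t_j).

    The sum is the indicator of a disjoint union, so its L_2 norm is the
    square root of the total length, which never exceeds 1 on [0, 1].
    """
    return math.sqrt(sum(t - s for s, t in intervals))

def uniform_partition(n: int) -> list[float]:
    return [k / n for k in range(n + 1)]

for n in (1, 4, 100, 10_000):
    # For the uniform partition the sum equals n * sqrt(1/n) = sqrt(n).
    print(n, strong_variation_sum(uniform_partition(n)))
```

The strong variation sums grow like $\sqrt{n}$, so no finite strong total variation exists, whereas `norm_of_summed_increments` stays bounded by 1 for every admissible choice of intervals.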


EXERCISE 7.3 


1. If $A(s) = (a_{jk}(s))$ is an $n$ by $n$ matrix where $a_{jk}(s)$ is a complex-valued function
defined for $a \le s \le b$, consider $A(s)$ as a vector function with values in $\mathfrak{M}_n$. When is
$A(s)$ (1) weakly continuous, (2) strongly continuous?

2. Consider $A(s)$ as an element of $\mathfrak{E}(C^n)$ and find necessary and sufficient conditions for
weak, strong, and uniform continuity.

3. Prove that a weakly (strongly) differentiable function $x(s)$ is weakly (strongly)
continuous.

4. Prove that if $x(s)$ is weakly continuous in $[a, b]$, then $\|x(s)\|$ is bounded in $[a, b]$.

5. Prove that strong bounded variation implies bounded variation which implies weak
bounded variation.

6. Consider a partition of the interval $[0, 1]$ by $n + 1$ points $s_0 = 0 < s_1 < s_2 < \cdots < s_n = 1$
and form the sum

$$\sum_{k=1}^{n} (s_k - s_{k-1})^{1/2}.$$

For $n$ fixed, find the supremum of all such sums and show that this is an unbounded
function of $n$.

7. Define $x(s)$ for $0 \le s \le 1$ as the characteristic function of the interval $[0, s]$ so that
$x(s) = f(s, t)$ equals 1 for $0 \le t \le s$ and 0 elsewhere. Let $\mathfrak{X}$ be the space $L_2(0, 1)$.
Show that $x(s)$ is of weak bounded variation in $[0, 1]$ and hence of bounded variation.
Use the preceding problem to show that $x(s)$ is not of strong bounded variation.

8. Is the situation different in $L(0, 1)$?

Let $x_n \in \mathfrak{X}$, $\forall n$, and set $S_n = \sum_{k=1}^{n} x_k$. The infinite series $\sum x_n$ is said to converge
weakly (strongly) [to $s$] if the sequence $\{S_n\}$ converges weakly (strongly) [to $s$]. The
series is absolutely convergent or convergent in norm if $\sum \|x_n\|$ converges,
unconditionally weakly (strongly) convergent if every subseries is weakly (strongly)
convergent.

9. If $\sum x_n$ converges weakly, show that $\|S_n\|$ is uniformly bounded.

10. Show that a numerical series is unconditionally convergent iff it is absolutely
convergent.

11. Show that a weakly unconditionally convergent series is necessarily strongly
unconditionally convergent.

12. Show that in the space $l_1$ weak convergence implies strong convergence.

13. A series $\sum x_n(s)$, $a \le s \le b$, with elements in $\mathfrak{X}$ may have convergence properties
holding uniformly with respect to $s$. Formulate a definition of uniformly weak
convergence, etc.

14. Do the same for operator series $\sum T_n(s)$.


7.4 ABSTRACT RIEMANN-STIELTJES INTEGRALS


Such an integral involves two functions: an integrand and an integrator. Suppose 
that one of these functions has values in a B-space X which is not a B-algebra. Then 
the other factor is normally restricted to be numerically valued. We have two types 
of integral according as the integrand or the integrator takes on values in X. The 
integral, just as in the classical case, is defined as the limit of a class of
Riemann-Stieltjes sums. There are various possibilities corresponding to the several topologies
available in $\mathfrak{X}$ that may be used to define the limit. Actually the two types
coexist for the same choice of limiting process, as may be shown by an obvious
generalization of the classical formula for integration by parts.

Basically we start with a vector function $f(s)$ with values in $\mathfrak{X}$ and a numerically
valued function $g(s)$, both defined in an interval $[a, b]$, where they are to satisfy
conditions to be specified later. We consider a partition of $[a, b]$

$$s_0 = a < s_1 < s_2 < \cdots < s_n = b \tag{7.4.1}$$

and a choice of intermediary points $\{t_j\}$ with

$$s_{j-1} \le t_j \le s_j, \quad j = 1, 2, \ldots, n. \tag{7.4.2}$$

The collection of these $2n + 1$ real numbers is denoted by $\pi$ and we set
$|\pi| = \max_j (s_j - s_{j-1})$. We then form the two Riemann-Stieltjes sums

$$S_\pi(f, g) = \sum_{j=1}^{n} f(t_j)\,[g(s_j) - g(s_{j-1})], \tag{7.4.3}$$

$$S_\pi(g, f) = \sum_{j=1}^{n} g(t_j)\,[f(s_j) - f(s_{j-1})], \tag{7.4.4}$$

and consider either the strong topology in $\mathfrak{X}$ or the weak one.


Definition 7.4.1. If in the chosen topology $S_\pi(f, g)$ should have a limit as
$|\pi| \to 0$, then this limit is the Riemann-Stieltjes integral

$$\int_a^b f(s)\,dg(s) \tag{7.4.5}$$

with respect to this topology.


Definition 7.4.2. If in the chosen topology $S_\pi(g, f)$ should have a limit as
$|\pi| \to 0$, then this limit is the Riemann-Stieltjes integral

$$\int_a^b g(s)\,df(s) \tag{7.4.6}$$

with respect to this topology.
As indicated above, we have 


Theorem 7.4.1. If either integral exists with respect to the chosen topology, then
both exist in this topology and

$$\int_a^b f(s)\,dg(s) = f(b)\,g(b) - f(a)\,g(a) - \int_a^b g(s)\,df(s). \tag{7.4.7}$$
Proof. As in the classical case, we have

$$\sum_{j=1}^{n} f(t_j)\,[g(s_j) - g(s_{j-1})] = f(b)\,g(b) - f(a)\,g(a) - \sum_{j=0}^{n} g(s_j)\,[f(t_{j+1}) - f(t_j)], \tag{7.4.8}$$

where $t_0 = a$, $t_{n+1} = b$. The second sum is of type (7.4.4) for

$$t_0 = a \le t_1 \le \cdots \le t_n \le t_{n+1} = b$$

and $t_j \le s_j \le t_{j+1}$. Further, $\max |t_{j+1} - t_j| \le 2|\pi|$. Hence, if one of the sums has
a limit, so does the other. ∎
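
The summation-by-parts identity (7.4.8) is purely algebraic and can be checked numerically. The sketch below is an illustration under assumptions not made in the text: the vector function `f` takes values in $R^2$ and the integrator `g` is the exponential function, both hypothetical choices, as are the function names and the midpoint choice of the $t_j$.

```python
import math

def f(s):
    """Hypothetical vector function with values in R^2."""
    return (math.cos(s), s * s)

def g(s):
    """Hypothetical numerically valued function."""
    return math.exp(s)

def sum_f_dg(s_pts, t_pts):
    """Left side of (7.4.8): sum_{j=1}^{n} f(t_j) [g(s_j) - g(s_{j-1})]."""
    total = [0.0, 0.0]
    for j in range(1, len(s_pts)):
        dg = g(s_pts[j]) - g(s_pts[j - 1])
        ft = f(t_pts[j - 1])  # t_pts[j - 1] holds t_j
        total = [total[i] + ft[i] * dg for i in range(2)]
    return total

def by_parts_side(s_pts, t_pts):
    """Right side of (7.4.8) with t_0 = a and t_{n+1} = b."""
    a, b = s_pts[0], s_pts[-1]
    t_ext = [a] + list(t_pts) + [b]
    total = [f(b)[i] * g(b) - f(a)[i] * g(a) for i in range(2)]
    for j in range(len(s_pts)):  # j = 0, ..., n
        hi, lo = f(t_ext[j + 1]), f(t_ext[j])
        total = [total[i] - g(s_pts[j]) * (hi[i] - lo[i]) for i in range(2)]
    return total

n = 8
s_pts = [j / n for j in range(n + 1)]
t_pts = [(s_pts[j - 1] + s_pts[j]) / 2 for j in range(1, n + 1)]
lhs = sum_f_dg(s_pts, t_pts)
rhs = by_parts_side(s_pts, t_pts)
print(lhs, rhs)  # the two vectors agree up to rounding
```

The agreement holds for any admissible partition and any choice of intermediary points, since no limit process is involved at this stage.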


This, of course, raises the question of the existence of one of the limits. In the
classical theory the integral (7.4.5) exists if $f$ is continuous and $g$ is of bounded
variation, both in $[a, b]$. The following theorem is a natural extension of this
prototype.


Theorem 7.4.2. Suppose that either (1) $f$ is a strongly continuous vector function
from $[a, b]$ to $\mathfrak{X}$ and $g$ is a numerically valued function of bounded variation on
$[a, b]$ or, alternatively, (2) $f$ is a vector-valued function from $[a, b]$ to $\mathfrak{X}$ of bounded
variation in the sense of Definition 7.3.4 while $g$ is a continuous numerically
valued function on $[a, b]$. Then the integrals (7.4.5) and (7.4.6) exist in the
normed topology of $\mathfrak{X}$.


Proof. The argument in case (1) is patterned on the classical case. Since $f$ is
strongly continuous in $[a, b]$, we have uniform strong continuity. That is, given an
$\varepsilon > 0$, there is a $\delta > 0$ such that

$$\|f(s_1) - f(s_2)\| < \varepsilon \quad \text{if } |s_1 - s_2| < \delta. \tag{7.4.9}$$

If we now take two sums of type (7.4.3), one with set $\pi_1$ and the other with set $\pi_2$
with $|\pi_1|$ and $|\pi_2| < \tfrac{1}{2}\delta$, then the usual argument, with absolute values replaced by
norms, gives

$$\|S_{\pi_1} - S_{\pi_2}\| \le 2\varepsilon\, V_a^b[g]. \tag{7.4.10}$$

This proves the existence of the strong integral (7.4.5) in case (1).
In case (2) $g$ is uniformly continuous in $[a, b]$, and choosing $\varepsilon$ and $\delta$ as above
we see that for any $x^* \in \mathfrak{X}^*$ we have

$$|x^*[S_{\pi_1} - S_{\pi_2}]| \le 2\varepsilon\, V_a^b\{x^*[f(s)]\}, \tag{7.4.11}$$

where, as above, $|\pi_1|$ and $|\pi_2|$ are $< \tfrac{1}{2}\delta$. The right hand member involves the total
variation of the numerically valued function $x^*[f(s)]$, and this should be compared
with the total variation of $f(s)$, which is known to be finite.

Some preliminary considerations will be helpful. It is known that a
complex-valued function $h(s) \in BV[a, b]$ admits of a representation of the form

$$h = h_1 - h_2 + i(h_3 - h_4) = h_r + i h_i, \tag{7.4.12}$$

where $h_1$ to $h_4$ are non-negative and non-decreasing. Moreover,

$$V[h_r] = V[h_1] + V[h_2], \qquad V[h_i] = V[h_3] + V[h_4]$$

for a suitably chosen decomposition. Further, $h_1$ and $h_2$ ($h_3$ and $h_4$) are never
increasing for the same values of $s$; if one member of a pair is increasing in an
interval, the other keeps a constant value.

Consider now a finite set of non-overlapping intervals $[a_j, b_j]$ in $[a, b]$. If a
particular interval $[a_j, b_j]$ is an interval of constancy of $h_2$, then $h_1(b_j) \ge h_1(a_j)$.
If in

$$\sum_j [h_r(b_j) - h_r(a_j)]$$

we extend the summation over intervals of constancy of $h_2$ the sum reduces to

$$\sum_{j=1}^{n} [h_1(b_j) - h_1(a_j)].$$

By taking sufficiently many such intervals we can get arbitrarily close to $V[h_1]$.
Similarly, we can approximate $V[h_2]$ by summing over intervals of constancy of $h_1$.
This argument also applies to $h_i$ and hence to $h$ itself. It follows that

$$\sup \Big| \sum_j [h(b_j) - h(a_j)] \Big| \ge \max_k V[h_k]. \tag{7.4.13}$$

But

$$V[h] \le V[h_1] + V[h_2] + V[h_3] + V[h_4] \le 4 \max_k V[h_k]. \tag{7.4.14}$$

This gives, finally,

$$V[h] \le 4 \sup \Big| \sum_j [h(b_j) - h(a_j)] \Big|. \tag{7.4.15}$$


We now apply this inequality to $h(s) = x^*[f(s)]$ and obtain

$$V\{x^*[f(s)]\} \le 4 \sup_\Delta \Big| \sum_{j=1}^{n} \{x^*[f(b_j)] - x^*[f(a_j)]\} \Big| = 4 \sup_\Delta \Big| x^*\Big\{ \sum_{j=1}^{n} [f(b_j) - f(a_j)] \Big\} \Big|.$$

Here the subscript $\Delta$ indicates that the supremum is taken for all $n$ and all choices
of $n$ non-overlapping intervals in $[a, b]$. Now

$$\|S_{\pi_1} - S_{\pi_2}\| = \sup_{\|x^*\| = 1} |x^*[S_{\pi_1} - S_{\pi_2}]| \le 2\varepsilon \sup_{\|x^*\| = 1} V\{x^*[f(s)]\}$$

$$\le 8\varepsilon \sup_{\|x^*\| = 1} \sup_\Delta \Big| x^*\Big\{ \sum_{j=1}^{n} [f(b_j) - f(a_j)] \Big\} \Big| \le 8\varepsilon \sup_\Delta \Big\| \sum_{j=1}^{n} [f(b_j) - f(a_j)] \Big\| = 8\varepsilon\, V[f]$$

in terms of the total variation of the vector function $f$. This shows that the integral
in case (2) also exists in the strong topology. ∎


The proof is much simpler in case (2) if f is of strong bounded variation rather 
than just bounded variation. 

The integrals (7.4.5) and (7.4.6) are elements of the B-space $\mathfrak{X}$, and as such they
may be operated on by any element $T$ of an operator algebra $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$. There are
two operators involved here, $T$ and $\int$, and the question arises whether or not they
commute. Can $T$ be taken under the sign of integration and applied to $f$? In
integrals of the second type we want to go a step further: $T[df(s)]$ looks mysterious,
while $d[Tf(s)]$ may possibly make sense. The following theorem gives an answer
to this question.


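Theorem 7.4.3. If $f$ is a strongly continuous mapping of $[a, b]$ into $\mathfrak{X}$, if
$g \in BV[a, b]$ and if $T \in \mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$, then $T[f](s)$ exists as an element of $\mathfrak{Y}$ and is
strongly continuous in $[a, b]$. Further,

$$\int_a^b T[f](s)\,dg(s) = T\Big[ \int_a^b f(s)\,dg(s) \Big]. \tag{7.4.16}$$

If, instead, $f(s) \in \mathfrak{X}$ and is of bounded variation in $[a, b]$, if $g \in C[a, b]$,
$T \in \mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$, and if $T[f](s)$ is of bounded variation in $[a, b]$, then

$$\int_a^b g(s)\,d\{T[f](s)\} = T\Big[ \int_a^b g(s)\,df(s) \Big]. \tag{7.4.17}$$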


Proof. Here

$$\|T[f](s_1) - T[f](s_2)\| \le \|T\|\, \|f(s_1) - f(s_2)\|$$

since $T$ is linear and bounded, and $f$ is strongly continuous in $[a, b]$. It follows that
$T[f](s)$ is also strongly continuous. Hence both sides of (7.4.16) exist. Equality
then follows from the linearity of $T$ which gives the corresponding identity for the
Riemann-Stieltjes sums

$$\sum_{j=1}^{n} T[f](t_j)\,[g(s_j) - g(s_{j-1})] = T\Big\{ \sum_{j=1}^{n} f(t_j)\,[g(s_j) - g(s_{j-1})] \Big\}.$$

Passing to the limit with $|\pi|$ we get (7.4.16). In the second case both sides exist by
assumption and Theorem 7.4.2. Again we have equality for the Riemann-Stieltjes
sums and in the limit (7.4.17) results. ∎
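
The identity for the Riemann-Stieltjes sums used in the proof can be verified directly in a finite-dimensional setting. The sketch below is only an illustration under assumed data: $T$ is a hypothetical $3 \times 2$ real matrix acting from $R^2$ to $R^3$, and `f`, `g`, and all helper names are choices made for the example.

```python
import math

# Hypothetical bounded linear operator from R^2 to R^3, given by a matrix.
T = [[1.0, 2.0], [0.0, -1.0], [3.0, 0.5]]

def apply_T(v):
    """Matrix-vector product T v."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in T]

def f(s):
    """Hypothetical vector function into R^2."""
    return [math.cos(s), s * s]

def g(s):
    """Hypothetical numerically valued integrator."""
    return math.exp(s)

def rs_sum(func, s_pts, t_pts):
    """Riemann-Stieltjes sum of type (7.4.3) for a vector function func."""
    total = [0.0] * len(func(s_pts[0]))
    for j in range(1, len(s_pts)):
        dg = g(s_pts[j]) - g(s_pts[j - 1])
        value = func(t_pts[j - 1])  # t_pts[j - 1] holds t_j
        total = [total[i] + value[i] * dg for i in range(len(total))]
    return total

n = 16
s_pts = [j / n for j in range(n + 1)]
t_pts = [(s_pts[j - 1] + s_pts[j]) / 2 for j in range(1, n + 1)]

sum_of_Tf = rs_sum(lambda s: apply_T(f(s)), s_pts, t_pts)  # sum formed from T[f]
T_of_sum = apply_T(rs_sum(f, s_pts, t_pts))                # T applied to the sum of f
print(sum_of_Tf, T_of_sum)  # equal up to rounding
```

Because the equality holds for every partition, it survives the passage to the limit, which is the content of (7.4.16).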


Corollary. If f(s) is of strong bounded variation, so is T[f](s), and (7.4.17) 
holds. 
Actually the theorem holds under more general assumptions, but we shall not
examine this possibility here.

An important special case is that in which $T$ is a linear bounded functional
$x^* \in \mathfrak{X}^*$. Here $x^*[f](s)$ is automatically of bounded variation if $f$ is of bounded
variation, so both identities hold with $T$ replaced by $x^*$.

It remains to say a few words about operator-valued integrals. Suppose
$U(s) \in \mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$ for each $s$ in $[a, b]$. According as $U(s)$ is continuous in the uniform,
the strong, or the weak operator topology, we can form the integrals

$$\int_a^b U(s)\,dg(s), \qquad \int_a^b U(s)[x]\,dg(s), \qquad \int_a^b y^*\{U(s)[x]\}\,dg(s). \tag{7.4.18}$$

Here $g \in BV[a, b]$ while $x$ is any element of $\mathfrak{X}$, $y^*$ any element of $\mathfrak{Y}^*$. We leave to
the reader to produce integrals of the second kind where $g \in C[a, b]$ and $U(s)$ is of
the appropriate type of bounded variation.


EXERCISE 7.4 


1. Verify (7.4.14). 


2. Why is $\|x\| = \sup_{\|x^*\| = 1} |x^*(x)|$?


3. Give a proof of Theorem 7.4.2, case (2), assuming $f$ to be of strong bounded variation.
4. Prove the Corollary to Theorem 7.4.3. 


5. In the proof of (7.4.13) what is the passage from $h_r$ and $h_i$ which justifies replacing
them by $h$?


6. Prove the existence of the integrals in (7.4.18). 


7. If $U(s)$ is continuous in the uniform operator topology of $\mathfrak{E}(\mathfrak{X}, \mathfrak{Y})$, prove that

$$\Big\{ \int_a^b U(s)\,dg(s) \Big\}[x] = \int_a^b U(s)[x]\,dg(s).$$


8. When is it true that

$$y^*\Big\{ \int_a^b U(s)[x]\,dg(s) \Big\} = \int_a^b y^*\{U(s)[x]\}\,dg(s)?$$


9. What do operator integrals of the second kind look like? Give sufficient conditions 
for existence. 


10. Abstract Riemann-Stieltjes integrals are bilinear operations. They are linear in the 
integrand as well as in the integrator. Write out the formulas and give a verification. 
11. Under the assumptions of Theorem 7.4.2 show that

$$\Big\| \int_a^b f(s)\,dg(s) \Big\| \le \max_{a \le s \le b} \|f(s)\|\; V_a^b[g].$$


12. Is there a corresponding estimate for integrals of the second kind? 


7.5 BOCHNER INTEGRALS 


Before we can introduce the Bochner integral of a vector-valued function, we briefly 
discuss how to abstract the most important properties of Lebesgue measure and 
Lebesgue integration treated, in geometric language, in Chapter 4. 

The notion of a measure space, here denoted by $(S, A, \mu)$, is given in Section 4.1.
While the Lebesgue measurable function serves as a prototype, the concept of an
A-measurable function on an abstract measure space $(S, A, \mu)$ can be formulated
similarly. Consequently the lemmas and theorems that are proved in essentially
the same manner as those of Chapter 4 are stated here without proof.


Definition 7.5.1. Let $(S, A, \mu)$ be an abstract measure space. An extended
real-valued function $f$ defined on $S$ is called A-measurable iff for each real number $\alpha$,
the set $\{x;\ f(x) > \alpha\}$ is A-measurable, i.e. $\{x;\ f(x) > \alpha\} \in A$.


(See Definitions 4.1.1 and 4.2.1.) 


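Lemma 7.5.1. Let $f$ be an extended real-valued function defined on $S$. The
following statements are equivalent:


1) $\{x;\ f(x) > \alpha\} \in A$, $\forall \alpha \in R$.
2) $\{x;\ f(x) \ge \alpha\} \in A$, $\forall \alpha \in R$.
3) $\{x;\ f(x) < \alpha\} \in A$, $\forall \alpha \in R$.
4) $\{x;\ f(x) \le \alpha\} \in A$, $\forall \alpha \in R$.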


(See Lemma 4.2.1.) 


Lemma 7.5.2. Let $c$ be a real number. Suppose that the real-valued functions
$f$ and $g$ are A-measurable, then so are the functions $cf$, $f^2$, $f + g$, $f - g$, $fg$, $|f|$.
Moreover, if $\{f_n\}$ is a sequence of A-measurable functions, then $\sup f_n$, $\inf f_n$,
$\liminf f_n$, and $\limsup f_n$ are all A-measurable.


(See Lemmas 4.2.2 and 4.2.3.) 


A real-valued A-measurable function is simple if it has only a finite number of
values (see (4.3.3)). It is clear that a simple A-measurable function $g$ can be
represented in the form

$$g = \sum_{j=1}^{n} \alpha_j \chi_{S_j}, \tag{7.5.1}$$

where $\alpha_j \in R$ and $\chi_{S_j}$ is the characteristic function of a set $S_j$ in $A$. Among these
representations for $g$ there is a unique standard representation characterized by the
fact that the $\alpha_j$ are distinct and the $S_j$ disjoint. Indeed, if $\alpha_1, \alpha_2, \ldots, \alpha_n$ are the
distinct values of $g$, and if $S_j = \{x;\ x \in S,\ g(x) = \alpha_j\}$, then the $S_j$ are disjoint and
$S = \bigcup_{j=1}^{n} S_j$.

It happens quite often in the process of abstraction that the formulation of a
generalized definition is suggested by properties of the concrete cases. Here we
present the following


Definition 7.5.2. If $g$ is a non-negative simple function on $S$ with the standard
representation (7.5.1), the integral of $g$ with respect to $\mu$ is defined to be the
extended real number

$$\int g\,d\mu = \sum_{j=1}^{n} \alpha_j\, \mu(S_j). \tag{7.5.2}$$


(See (4.3.4).) 
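
Definition 7.5.2 becomes entirely explicit on a finite measure space. The sketch below is an illustration under assumptions of its own: $S$ is taken to be a four-point set with every subset measurable, $\mu$ is given by hypothetical point weights, and the helper names are invented for the example. It computes the standard representation of a simple function and then the sum (7.5.2), using exact rational arithmetic.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite measure space: S = {1, 2, 3, 4}, every subset measurable,
# and mu determined by the weights of the single points.
mu = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}

def measure(subset):
    """mu(E) for a subset E of S."""
    return sum((mu[x] for x in subset), Fraction(0))

def standard_representation(g):
    """Map each distinct value alpha_j of g to its level set S_j = {x; g(x) = alpha_j}."""
    level_sets = defaultdict(set)
    for x in mu:
        level_sets[g(x)].add(x)
    return dict(level_sets)

def integral_simple(g):
    """Integral of a non-negative simple function: sum_j alpha_j * mu(S_j), as in (7.5.2)."""
    rep = standard_representation(g)
    return sum((alpha * measure(S_j) for alpha, S_j in rep.items()), Fraction(0))

def g(x):
    """Hypothetical simple function taking the two values 3 and 5."""
    return 3 if x <= 2 else 5

print(standard_representation(g))  # level sets {1, 2} for the value 3 and {3, 4} for the value 5
print(integral_simple(g))          # 3 * 3/4 + 5 * 1/4 = 7/2
```

The level sets are disjoint and cover $S$ by construction, which is exactly what distinguishes the standard representation from other representations of the form (7.5.1).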


As a direct consequence, the following can be verified easily. 


Lemma 7.5.3. (1) If $g$ and $h$ are non-negative simple functions on $S$, and if
$\alpha \ge 0$, then

$$\int \alpha g\,d\mu = \alpha \int g\,d\mu, \tag{7.5.3}$$

$$\int (g + h)\,d\mu = \int g\,d\mu + \int h\,d\mu. \tag{7.5.4}$$

(2) If $\nu$ is defined for $E$ in $A$ by

$$\nu(E) = \int g \chi_E\,d\mu, \tag{7.5.5}$$

then $\nu$ is a measure on $A$.


Definition 7.5.3. If $f$ is a non-negative extended real-valued A-measurable
function on $S$, the integral of $f$ with respect to $\mu$ is defined to be the extended real
number

$$\int f\,d\mu = \sup \int g\,d\mu, \tag{7.5.6}$$

where the supremum is extended over all simple functions $g$ on $S$ satisfying
$0 \le g(x) \le f(x)$ for all $x \in S$. If $f$ is a non-negative extended real-valued
A-measurable function on $S$, if $E$ belongs to $A$, then $f\chi_E$ is also a non-negative
A-measurable function on $S$ and the integral of $f$ over $E$ with respect to $\mu$ is defined
to be the extended real number

$$\int_E f\,d\mu = \int f \chi_E\,d\mu.$$


(See Definition 4.3.2, Theorems 4.3.3 to 4.3.6.) 


At this stage of the abstraction process, we must check that Definition 7.5.3
really extends Definition 4.3.2. Suppose that $f$ is a non-negative extended
real-valued Lebesgue measurable function defined on $R^m$. According to Definition
4.3.2, the Lebesgue integral of $f$ with respect to the $m$-dimensional Lebesgue measure $\mu_m$
is equal to $\int f(P)\,dP = \int f\,d\mu_m = \mu_{m+1}[\Omega_0(f; S)]$. By virtue of Theorems 4.3.4–4.3.6
it can be verified easily that

$$\int f\,d\mu_m = \mu_{m+1}[\Omega_0(f; S)] = \sup \int g\,d\mu_m, \tag{7.5.7}$$

where $g$ ranges over all simple functions with $0 \le g \le f$.
The next lemma follows directly from Definition 7.5.3.
The next lemma follows directly from Definition 7.5.3. 


Lemma 7.5.4. (1) If $f$ and $g$ are non-negative extended real-valued
A-measurable functions on $S$ with $f \le g$, then

$$\int f\,d\mu \le \int g\,d\mu. \tag{7.5.8}$$

(2) If $f$ is a non-negative extended real-valued A-measurable function on $S$, and if
$E$, $F$ belong to $A$ with $E \subset F$, then

$$\int_E f\,d\mu \le \int_F f\,d\mu. \tag{7.5.9}$$


We are now prepared to establish an important theorem which provides the 
key to the fundamental convergence properties of integration. 


Theorem 7.5.1 (Monotone Convergence Theorem). Let $\{f_n\}$ be a monotone
increasing sequence of non-negative extended real-valued A-measurable functions
defined on $S$. Let $f$ be the limit function of $\{f_n\}$. Then

$$\int f\,d\mu = \lim_{n \to \infty} \int f_n\,d\mu.$$


(See Theorem 4.3.5.) 


Proof. According to Lemma 7.5.2 the limit function $f$ is A-measurable. Since
$f_n \le f_{n+1} \le f$ it follows from (7.5.8) of Lemma 7.5.4 that

$$\int f_n\,d\mu \le \int f_{n+1}\,d\mu \le \int f\,d\mu$$

for all positive integers $n$. Therefore

$$\lim_{n \to \infty} \int f_n\,d\mu \le \int f\,d\mu.$$

To establish the opposite inequality, let $\alpha$ be a real number, $0 < \alpha < 1$, and let $g$
be a simple function with $0 \le g \le f$. Let $S_n = \{x;\ x \in S,\ f_n(x) \ge \alpha g(x)\}$ so that
$S_n \in A$, $S_n \subset S_{n+1}$ and $S = \bigcup S_n$. According to (7.5.9) of Lemma 7.5.4,

$$\int_{S_n} \alpha g\,d\mu \le \int_{S_n} f_n\,d\mu \le \int f_n\,d\mu. \tag{7.5.10}$$

Since the sequence $\{S_n\}$ is monotone increasing and has union $S$, it follows from
Lemmas 7.5.3 and 4.1.2 that

$$\int g\,d\mu = \lim_{n \to \infty} \int_{S_n} g\,d\mu.$$

Therefore, letting $n \to \infty$ in (7.5.10) we obtain

$$\alpha \int g\,d\mu \le \lim_{n \to \infty} \int f_n\,d\mu.$$

Since this holds for arbitrary $\alpha$ with $0 < \alpha < 1$, it follows that

$$\int g\,d\mu \le \lim_{n \to \infty} \int f_n\,d\mu,$$

and since $g$ is an arbitrary simple function on $S$ such that $0 \le g \le f$, we have

$$\int f\,d\mu = \sup \int g\,d\mu \le \lim_{n \to \infty} \int f_n\,d\mu.$$

Combining this with the opposite inequality, we obtain the desired result. ∎


A careful study of the proof of Theorem 4.3.6 reveals that the use of Lebesgue 
measure is immaterial. The same process of construction as used in Theorem 4.3.6 
gives 


Theorem 7.5.2. Let $f$ be a non-negative extended real-valued A-measurable
function on $S$. Then there exists a monotone increasing sequence $\{f_n\}$ of
non-negative simple functions on $S$ such that $\{f_n\}$ converges to $f$.
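
One common way to produce such a sequence is the dyadic truncation $f_n = \min(n, 2^{-n} \lfloor 2^n f \rfloor)$. The construction of Theorem 4.3.6 is not reproduced in this section, so the sketch below should be read as an assumption about its general shape, not as the author's exact procedure. It works on a hypothetical four-point measure space with finite values of $f$ only, and checks both the monotone increase of the approximants and the convergence of their integrals, in line with the Monotone Convergence Theorem.

```python
import math
from fractions import Fraction

def dyadic_approximation(f_value, n):
    """n-th approximant min(n, floor(2^n f) / 2^n) at a point where f is finite."""
    return min(Fraction(n), Fraction(math.floor(2**n * f_value), 2**n))

# Hypothetical finite measure space (point weights) and a non-negative function on it.
mu = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 8), 4: Fraction(1, 8)}
f = {1: Fraction(1, 3), 2: Fraction(7, 5), 3: Fraction(10), 4: Fraction(0)}

def integral_of_approximant(n):
    """Integral of the simple function f_n with respect to mu."""
    return sum((dyadic_approximation(f[x], n) * mu[x] for x in mu), Fraction(0))

integral_of_f = sum((f[x] * mu[x] for x in mu), Fraction(0))

for n in (1, 2, 4, 8, 12):
    # The gap to the integral of f is non-negative and shrinks with n.
    print(n, float(integral_of_f - integral_of_approximant(n)))
```

Each $f_n$ takes only finitely many values, so it is simple; once $n$ exceeds the largest value of $f$, the pointwise error is at most $2^{-n}$.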


As direct consequences of Theorem 7.5.2 and/or the Monotone Convergence 
Theorem, the following corollaries can be verified easily. 


Corollary 1 (Fatou's Lemma). Let $\{f_n\}$ be a sequence of non-negative extended
real-valued A-measurable functions on $S$, then

$$\int (\liminf f_n)\,d\mu \le \liminf \int f_n\,d\mu. \tag{7.5.11}$$

(See Problem 6 in Exercise 4.3.)


Corollary 2. Given a real number $\alpha \ge 0$. If $f$ is a non-negative extended
real-valued A-measurable function on $S$, then so is $\alpha f$ and

$$\int \alpha f\,d\mu = \alpha \int f\,d\mu. \tag{7.5.12}$$

(See Corollary 2 to Theorem 4.3.6.)


Corollary 3. If $f$ and $g$ are non-negative extended real-valued A-measurable
functions on $S$, then so is $f + g$ and

$$\int (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu. \tag{7.5.13}$$

(See Corollary 3 to Theorem 4.3.6.)
With respect to $\mu$, we shall now discuss the integration of extended real-valued
A-measurable functions which are not of one sign. Many definitions and proofs of
Chapter 4 depend only on properties of Lebesgue measure true for an arbitrary
measure in an abstract measure space and carry over to this case.


Definition 7.5.4. An extended real-valued A-measurable function $f$ on $S$ is
integrable over $S$ iff the positive and negative parts $f^+$, $f^-$ of $f$ have finite integrals
with respect to $\mu$. In this case, the integral of $f$ with respect to $\mu$ is defined to be

$$\int f \, d\mu = \int f^+ \, d\mu - \int f^- \, d\mu. \tag{7.5.14}$$

(See Definition 4.3.3.)


We notice that the remark made after Definition 4.3.3 also applies to this 
extended definition. 
The Principle of Absolute Integrability takes the following form. 


Theorem 7.5.3. A real-valued A-measurable function $f$ is integrable with respect
to $\mu$ over $S$ iff $|f|$ is integrable over $S$. In this case

$$\Big| \int f \, d\mu \Big| \le \int |f| \, d\mu. \tag{7.5.15}$$

(See Theorem 4.3.7.)


The linearity of integration with respect to $\mu$ should become clear through the
following theorem, whose proof is almost identical with that of Theorem 4.3.9.


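Theorem 7.5.4. If $\alpha$ is a real number and the real-valued functions $f$ and $g$ are
integrable over $S$, then so are $\alpha f$ and $f + g$. Furthermore,

$$\int (\alpha f) \, d\mu = \alpha \int f \, d\mu, \tag{7.5.16}$$

$$\int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu. \tag{7.5.17}$$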


The following is probably the most important convergence theorem for 
integrable functions. 


Theorem 7.5.5 (Dominated Convergence Theorem for integrals with respect to
$\mu$). Let $g$ be integrable over $S$, and suppose that $\{f_n\}$ is a sequence of
A-measurable functions such that on $S$ we have $|f_n(x)| \le g(x)$ for all $n$, and such
that $\{f_n\}$ converges $\mu$-almost everywhere to an A-measurable function $f$. Then
$f$ is integrable and

$$\int f \, d\mu = \lim_{n\to\infty} \int f_n \, d\mu. \tag{7.5.18}$$

Proof. Apply Fatou's Lemma to the sequences $\{g + f_n\}$ and $\{g - f_n\}$. ∎
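
The one-line proof above can be unpacked as follows. This is only a sketch of the standard argument, under the assumption that $g$ is finite wherever the differences $g \pm f_n$ are formed (an exceptional set of $\mu$-measure zero may be discarded); it uses nothing beyond Fatou's Lemma (7.5.11), the additivity (7.5.13), and the finiteness of $\int g \, d\mu$.

```latex
% Both sequences are non-negative because |f_n| <= g, and |f| <= g a.e.,
% so f is integrable by Theorem 7.5.3.
\begin{align*}
\int (g + f)\, d\mu &\le \liminf_{n\to\infty} \int (g + f_n)\, d\mu
   = \int g \, d\mu + \liminf_{n\to\infty} \int f_n \, d\mu, \\
\int (g - f)\, d\mu &\le \liminf_{n\to\infty} \int (g - f_n)\, d\mu
   = \int g \, d\mu - \limsup_{n\to\infty} \int f_n \, d\mu .
\end{align*}
% Subtracting the finite number \int g d\mu from both lines gives
\[
\limsup_{n\to\infty} \int f_n \, d\mu \;\le\; \int f \, d\mu \;\le\; \liminf_{n\to\infty} \int f_n \, d\mu ,
\]
% so the limit exists and equals \int f d\mu, which is (7.5.18).
```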


The following will be needed in introducing the Bochner Integral. 
Lemma 7.5.5. Suppose that for each $n$, $f_n$ is an extended real-valued function
integrable with respect to $\mu$ over $S$. Suppose, further, that

$$\sum_{n=1}^{\infty} \int |f_n| \, d\mu < \infty. \tag{7.5.19}$$

Then the series $\sum_{n=1}^{\infty} f_n$ converges $\mu$-a.e. on $S$ to a real-valued integrable function $f$.
Moreover,

$$\int f \, d\mu = \sum_{n=1}^{\infty} \int f_n \, d\mu. \tag{7.5.20}$$

(See Problem 9, Exercise 4.3.)


Let $(S, A, \mu)$ be an abstract measure space. Let $X$ be a B-space. One type of
integral assigned to certain vector-valued functions $x(s)$ on $S$ into $X$ was developed
by S. Bochner (1933) and may be described as follows. One begins with the simple
functions $x(t)$, identifying any pair which differ only on a set of measure zero. The
classes of such functions then form a normed linear space with norm
$\|x(\cdot)\|_1 = \int \|x(t)\| \, d\mu$. Defining the vector-valued integral

$$\int x(t) \, d\mu$$

in the obvious way, it is clear that

$$\Big\| \int x(t) \, d\mu \Big\| \le \|x(\cdot)\|_1. \tag{7.5.21}$$

If one now completes the space, one can extend the integral to all Cauchy sequences
and obtain the Bochner integral. More precisely, let us start with


Definition 7.5.5. Each vector-valued function $x(s)$ on $S$ into $X$ which assumes
only a finite number of distinct values, each value $\ne 0$ on a $\mu$-measurable set of
finite measure, is called a simple function.


Obviously there is a uniquely determined standard representation for each 
simple function. See (7.5.1). 

Observe that in the previous discussion a function of the form (7.5.1) is called
a simple function even though the condition $\mu(S_j) < \infty$ for $j = 1, 2, \ldots, m$ is not
necessarily satisfied. Therefore we stress the fact that in the present discussion we
adhere to the narrower interpretation where this condition is satisfied.


Definition 7.5.6. If $g(s)$ is a simple function on $S$ with the standard
representation

$$g(s) = \sum_{j=1}^{m} g_j \chi_{S_j}, \tag{7.5.22}$$

the vector-valued integral of $g(s)$ with respect to $\mu$ is defined to be the vector

$$\int g(s) \, d\mu = \sum_{j=1}^{m} g_j \, \mu(S_j). \tag{7.5.23}$$

(See Definition 7.5.2.)
It is clear that this vector-valued integral is linear on the vector space of all
simple functions. Finally, (7.5.21) holds. For the proof note that if $g(s)$ is given by
(7.5.22) with disjoint sets $S_j$, then

$$\|g(s)\| = \sum_{j=1}^{m} \|g_j\| \, \chi_{S_j}(s), \quad \forall s \in S.$$

Hence

$$\Big\| \int g(s) \, d\mu \Big\| = \Big\| \sum_{j=1}^{m} g_j \, \mu(S_j) \Big\| \le \sum_{j=1}^{m} \|g_j\| \, \mu(S_j) = \int \|g(s)\| \, d\mu = \|g(\cdot)\|_1.$$
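
The inequality just proved can be checked numerically on a small example. The sketch below is only illustrative: it assumes $X = \mathbb{R}^2$ with the Euclidean norm, and the function names, the values $g_j$, and the measures $\mu(S_j)$ are hypothetical data chosen for the check, not taken from the text.

```python
import math

def integral_of_simple(values, measures):
    """Return sum_j g_j * mu(S_j), the vector-valued integral (7.5.23)."""
    dim = len(values[0])
    return [sum(g[i] * m for g, m in zip(values, measures)) for i in range(dim)]

def l1_norm_of_simple(values, measures):
    """Return sum_j ||g_j|| * mu(S_j), i.e. the integral of the norm."""
    return sum(math.hypot(g[0], g[1]) * m for g, m in zip(values, measures))

# Hypothetical data: values g_j taken on three disjoint sets S_1, S_2, S_3
# of measures 0.5, 0.25, 0.25.
values = [(3.0, 4.0), (-3.0, 4.0), (0.0, -1.0)]
measures = [0.5, 0.25, 0.25]

integral = integral_of_simple(values, measures)
lhs = math.hypot(integral[0], integral[1])  # norm of the integral
rhs = l1_norm_of_simple(values, measures)   # integral of the norm

print(integral, lhs, rhs)
assert lhs <= rhs  # inequality (7.5.21) for this simple function
```

The gap between the two sides comes from cancellation among the vectors $g_j$; equality would require all $g_j$ to point in the same direction.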


Theorem 7.5.6. If a sequence $\{x_n(t)\}$ of vector-valued simple functions on $S$
satisfies

$$\lim \int \|x_n(t) - x_m(t)\| \, d\mu = 0 \tag{7.5.24}$$

as $m, n \to \infty$, then there exists a unique vector-valued function $x(t)$ such that
$\|x(t)\|$ and all functions $\|x(t) - x_n(t)\|$ are A-measurable and

$$\lim_{n\to\infty} \int \|x(t) - x_n(t)\| \, d\mu = 0. \tag{7.5.25}$$

Uniqueness here is understood in the sense that functions differing only on a
set of $\mu$-measure zero are identified.


Proof. The argument is similar to that used to prove that the space $L^1$ of all
numerical $\mu$-integrable functions is a complete metric space. Let the
sequence $\{x_n(t)\}$ of simple functions satisfy (7.5.24). This implies the existence
of a subsequence $\{x_{n_k}(t)\}$ satisfying

$$\sum_{k=1}^{\infty} \int \|x_{n_{k+1}}(t) - x_{n_k}(t)\| \, d\mu < \infty. \tag{7.5.26}$$

See (4.4.21). Then, by Lemma 7.5.5,

$$\|x_{n_1}(t)\| + \sum_{k=1}^{\infty} \|x_{n_{k+1}}(t) - x_{n_k}(t)\|$$

is $\mu$-integrable and hence finite for almost every $t \in S$. Since in a B-space an
absolutely convergent series is convergent (why?), the vector-valued series

$$x_{n_1}(t) + \sum_{k=1}^{\infty} [x_{n_{k+1}}(t) - x_{n_k}(t)]$$

converges for the same values of $t$; let $x(t)$ be its sum where it converges, i.e. $\mu$-a.e.,
and let $x(t) = 0$ elsewhere. Since $x(t) = \lim_{k\to\infty} x_{n_k}(t)$ $\mu$-a.e., we have
$\|x(t)\| = \lim_{k\to\infty} \|x_{n_k}(t)\|$ and $\|x(t) - x_n(t)\| = \lim_{k\to\infty} \|x_{n_k}(t) - x_n(t)\|$ $\mu$-a.e.
Thus $\|x(t)\|$ and $\|x(t) - x_n(t)\|$ are A-measurable. It follows from

$$x(t) - x_{n_p}(t) = \sum_{k=p}^{\infty} [x_{n_{k+1}}(t) - x_{n_k}(t)]$$
that

$$\lim_{p\to\infty} \int \|x(t) - x_{n_p}(t)\| \, d\mu = 0$$

and, hence,

$$\int \|x(t) - x_n(t)\| \, d\mu \le \int \|x(t) - x_{n_p}(t)\| \, d\mu + \int \|x_{n_p}(t) - x_n(t)\| \, d\mu.$$

As $n$ and $p \to \infty$, both terms on the right go to 0, so (7.5.25) holds.
Assume that

$$\int \|y(t) - x_n(t)\| \, d\mu \to 0$$

for another vector-valued $y(t)$. Then, in particular, this holds for $n = n_k$, where
$\{x_{n_k}(t)\}$ is the subsequence used above. But the $L^1$ convergence of $\|y(t) - x_{n_k}(t)\|$
to zero implies pointwise convergence almost everywhere, at least along a further
subsequence. Along that subsequence $\{x_{n_k}(t)\}$ converges pointwise to $x(t)$ and to $y(t)$
simultaneously, so $y(t) = x(t)$ $\mu$-a.e. This shows that $x(t)$ is uniquely determined.
Furthermore,

$$\Big\| \int x_n(t) \, d\mu - \int x_m(t) \, d\mu \Big\| = \Big\| \int [x_n(t) - x_m(t)] \, d\mu \Big\| \le \int \|x_n(t) - x_m(t)\| \, d\mu \to 0 \quad \text{as } m, n \to \infty.$$

Thus the Cauchy sequence $\{\int x_n(t) \, d\mu\}$ converges in $X$. Suppose that $\{x_n(t)\}$ and
$\{y_n(t)\}$ are two sequences of simple functions such that

$$\lim \int \|x(t) - x_n(t)\| \, d\mu = 0, \qquad \lim \int \|x(t) - y_n(t)\| \, d\mu = 0.$$

Then

$$\Big\| \int x_n(t) \, d\mu - \int y_n(t) \, d\mu \Big\| \le \int \|x_n(t) - y_n(t)\| \, d\mu \le \int \|x(t) - x_n(t)\| \, d\mu + \int \|x(t) - y_n(t)\| \, d\mu.$$

This shows that the limit element of $\{\int x_n(t) \, d\mu\}$ is uniquely determined by $x(t)$.
We use $\int x(t) \, d\mu$ to denote $\lim \int x_n(t) \, d\mu$, and

$$\int x(t) \, d\mu \tag{7.5.27}$$

is called the Bochner integral of $x(t)$ with respect to $\mu$. ∎


We shall denote the collection of all $x(t)$ for which the Bochner integral is thus
defined by $\mathfrak{B}^1 = \mathfrak{B}^1(S, A, \mu; X)$. It is evident that $\mathfrak{B}^1$ is a complex vector space, and
Bochner integration is linear on $\mathfrak{B}^1$. Any $x(t) \in \mathfrak{B}^1$ is called a Bochner integrable
function. In particular, if $X = C$, then $\mathfrak{B}^1 = L^1(S, A, \mu)$ is the space of all complex-valued
$\mu$-integrable functions.


Theorem 7.5.7. If $x(t)$ is Bochner integrable, then

$$\Big\| \int x(t) \, d\mu \Big\| \le \int \|x(t)\| \, d\mu. \tag{7.5.28}$$

Proof. Let $\{x_n(t)\}$ be a sequence of simple functions such that

$$\lim \int \|x(t) - x_n(t)\| \, d\mu = 0. \tag{7.5.29}$$

Then

$$\int \|x(t)\| \, d\mu = \lim \int \|x_n(t)\| \, d\mu. \tag{7.5.30}$$

Since we also have

$$\Big\| \int x_n(t) \, d\mu \Big\| \le \int \|x_n(t)\| \, d\mu, \tag{7.5.31}$$

the last three relations imply (7.5.28). ∎


Theorem 7.5.8. The complex vector space $\mathfrak{B}^1$ of all Bochner integrable
functions $x(t)$ is a B-space with respect to the norm $\int \|x(t)\| \, d\mu$.


Proof. It is clear that this expression is a norm on $\mathfrak{B}^1$. Let us prove completeness
under the norm. Let $\{y_n(t)\}$ be a Cauchy sequence in $\mathfrak{B}^1$, so that

$$\int \|y_n(t) - y_m(t)\| \, d\mu \to 0 \quad \text{as } m, n \to \infty.$$

By the definition of Bochner integrability there exists for each $y_n(t)$ a simple function
$x_n(t)$ such that

$$\int \|y_n(t) - x_n(t)\| \, d\mu < 2^{-n}.$$

By the triangle inequality

$$\int \|x_n(t) - x_m(t)\| \, d\mu \to 0 \quad \text{as } m, n \to \infty.$$

Hence, by Theorem 7.5.6, there exists a function $x(t) \in \mathfrak{B}^1$ with

$$\lim \int \|x(t) - x_n(t)\| \, d\mu = 0.$$

Once more, by the triangle inequality, we have

$$\lim_{n\to\infty} \int \|x(t) - y_n(t)\| \, d\mu = 0,$$

and completeness follows. ∎
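
The two appeals to the triangle inequality in this proof can be written out explicitly. The following is a sketch of the intended estimates, using only the choice of the simple functions $x_n(t)$ with $\int \|y_n(t) - x_n(t)\| \, d\mu < 2^{-n}$ made above.

```latex
% First use: the approximating simple functions form a Cauchy sequence in the mean.
\begin{align*}
\int \|x_n - x_m\| \, d\mu
  &\le \int \|x_n - y_n\| \, d\mu + \int \|y_n - y_m\| \, d\mu + \int \|y_m - x_m\| \, d\mu \\
  &<  2^{-n} + \int \|y_n - y_m\| \, d\mu + 2^{-m} \;\longrightarrow\; 0
  \qquad (m, n \to \infty).
\end{align*}
% Second use: the limit x of the x_n is also the limit of the y_n.
\begin{align*}
\int \|x - y_n\| \, d\mu
  &\le \int \|x - x_n\| \, d\mu + \int \|x_n - y_n\| \, d\mu
   <  \int \|x - x_n\| \, d\mu + 2^{-n} \;\longrightarrow\; 0
  \qquad (n \to \infty).
\end{align*}
```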


Theorem 7.5.9 (Dominated Convergence Theorem for Bochner Integrals).
If $y_n(t) \in \mathfrak{B}^1$ and $\|y_n(t)\| \le g(t)$ for $n = 1, 2, \ldots$, where $g(t)$ is $\mu$-integrable over
$S$, and if $y_n(t)$ converges to $x(t)$ $\mu$-a.e., then $x(t) \in \mathfrak{B}^1$ and

$$\lim \int \|x(t) - y_n(t)\| \, d\mu = 0. \tag{7.5.32}$$

In particular,

$$\int x(t) \, d\mu = \lim \int y_n(t) \, d\mu. \tag{7.5.33}$$


Proof. From

$$y_n(t) - x(t) = \lim_{m\to\infty} [y_n(t) - y_m(t)] \quad \mu\text{-a.e.}$$

it follows that

$$\|y_n(t) - x(t)\| = \lim_{m\to\infty} \|y_n(t) - y_m(t)\| \quad \mu\text{-a.e.},$$

so that the left member is A-measurable. Moreover,

$$\|y_n(t) - x(t)\| \le 2g(t) \quad \text{and} \quad \lim_{n\to\infty} \|y_n(t) - x(t)\| = 0 \quad \mu\text{-a.e. on } S.$$

Hence, by the dominated convergence theorem for numerical functions,

$$\lim \int \|y_n(t) - x(t)\| \, d\mu = 0.$$

Thus by the preceding theorem, $x(t) \in \mathfrak{B}^1$. Furthermore,

$$\int x(t) \, d\mu = \lim \int y_n(t) \, d\mu$$

since

$$\Big\| \int x(t) \, d\mu - \int y_n(t) \, d\mu \Big\| \le \int \|x(t) - y_n(t)\| \, d\mu. \quad ∎$$


In Chapter 4 we have seen that the Lebesgue measurable functions and the
Lebesgue integrable functions are closely related. This relation extends to
A-measurable functions and $\mu$-integrable functions when integration on an abstract
measure space $(S, A, \mu)$ is treated. In the process of abstraction, Bochner's
integral yields $X$-valued integration on an abstract measure space $(S, A, \mu)$. It is
natural to ask: How is Bochner integrability characterized in terms of some
measurability defined for functions from $S$ to a B-space $X$? The first step in this
direction is given by


Definition 7.5.7. Let $X$ be a B-space and $(S, A, \mu)$ a measure space. A function
$x(t)$ from $S$ to $X$ is called weakly A-measurable iff for any $x^* \in X^*$, the numerical
function $x^*[x(t)]$ of $t$ is A-measurable. $x(t)$ is strongly A-measurable if there
exists a sequence of simple functions strongly convergent to $x(t)$ $\mu$-a.e. on $S$.


Definition 7.5.8. $x(t)$ is said to be separably valued if its range $x(S) = \{x(t);\ t \in S\}$
is separable. $x(t)$ is called $\mu$-almost separably valued if there exists an
A-measurable set $S_0$ of $\mu$-measure zero such that $x(S \ominus S_0)$ is separable.


The two notions of measurability for $X$-valued functions are connected by the
following theorem due to B. J. Pettis (1938).


Theorem 7.5.10. $x(t)$ is strongly A-measurable iff it is weakly A-measurable
and $\mu$-almost separably valued.


Proof. Suppose that $x(t)$ is strongly A-measurable. Then there exists a sequence of
simple functions $\{x_n(t)\}$ such that

$$\lim_{n\to\infty} x_n(t) = x(t)$$

except on a set $S_0 \in A$ of $\mu$-measure zero. Clearly, each vector-valued simple
function $x_n(t)$ is weakly A-measurable. This implies that, for each $x^* \in X^*$, the
numerical function

$$x^*[x(t)] = \lim_{n\to\infty} x^*[x_n(t)]$$

is A-measurable. Therefore, $x(t)$ is weakly A-measurable. Furthermore, the union
of the ranges of $x_n(t)$ $(n = 1, 2, \ldots)$ is a countable set, and the closure of this set is
separable. Being a subset of this separable set, $x(S \ominus S_0)$ is therefore separable.

In proving the converse proposition we may assume, without loss of generality, that the
range $x(S)$ is separable. Hence the space $X$ itself may be assumed
separable; otherwise simply replace $X$ by the smallest closed subspace generated
by $x(S)$.

For any real number $a$, consider the sets

$$A = \{t;\ \|x(t)\| \le a\} \quad \text{and} \quad A(x^*) = \{t;\ |x^*[x(t)]| \le a\},$$

where $x^* \in X^*$. Clearly,

$$A \subseteq \bigcap \{A(x^*);\ \|x^*\| \le 1\}. \tag{7.5.34}$$

By a direct consequence of the Hahn-Banach theorem (see Section 10.3 and
Theorem 2.3.2(1)), for a fixed $t$, there exists an $x_0^* \in X^*$ with $\|x_0^*\| = 1$ and
$x_0^*[x(t)] = \|x(t)\|$. This implies that we can invert the order of inclusion in
(7.5.34). Thus the inclusion should be replaced by equality.

In order to show that A can be reduced to the intersection of countably many 
sets A(x*), we apply the following lemma to be proved by the reader (see Problem 
5, Exercise 7.5). 


Lemma 7.5.6. Let $X$ be a separable B-space. Then there exists a sequence
$\{x_n^*\} \subset X^*$ with $\|x_n^*\| \le 1$ such that, for any $x_0^* \in X^*$ with $\|x_0^*\| \le 1$, a
subsequence $\{x_{n_k}^*\}$ can be so chosen that

$$\lim_{k\to\infty} x_{n_k}^*(x) = x_0^*(x), \quad \forall x \in X. \tag{7.5.35}$$


By the lemma we have now

$$A = \bigcap \{A(x^*);\ \|x^*\| \le 1\} = \bigcap_{j=1}^{\infty} A(x_j^*).$$

From this and the weak A-measurability of $x(t)$ it follows that $\|x(t)\|$ is
A-measurable.

For any positive integer $n$, the range $\{x(t);\ t \in S\}$ may be covered by a countable
number of open spheres $S_{j,n}$ $(j = 1, 2, \ldots)$ of radius $\le 1/n$, since the range is
separable by assumption. Let $x_{j,n}$ denote the center of the sphere $S_{j,n}$. As proved
above, $\|x(t) - x_{j,n}\|$ is A-measurable in $t$. Hence the set

$$T_{j,n} = \{t;\ t \in S,\ x(t) \in S_{j,n}\}$$

is A-measurable and $S = \bigcup_{j=1}^{\infty} T_{j,n}$. Construct $x_n(t)$ by setting

$$x_n(t) = x_{i,n} \quad \text{if } t \in T_{i,n} \ominus \bigcup_{j=1}^{i-1} T_{j,n}.$$

Since $S = \bigcup_{i=1}^{\infty} \big(T_{i,n} \ominus \bigcup_{j=1}^{i-1} T_{j,n}\big)$, it follows that $\|x(t) - x_n(t)\| \le 1/n$ for every
$t \in S$. Furthermore, each $x_n(t)$ is strongly A-measurable because $T_{i,n} \ominus \bigcup_{j=1}^{i-1} T_{j,n}$
is A-measurable. $x(t)$ is the strong limit of the sequence $\{x_n(t)\}$ and is thus
strongly A-measurable. ∎


Theorem 7.5.11. A strongly A-measurable function $x(t)$ is Bochner $\mu$-integrable
iff $\|x(t)\|$ is $\mu$-integrable.


Proof. The necessity has been shown in Theorem 7.5.7. To prove the sufficiency,
let $\{x_n(t)\}$ be a sequence of simple functions strongly convergent to $x(t)$ $\mu$-a.e. on $S$.
Let

$$y_n(t) = \begin{cases} x_n(t) & \text{if } \|x_n(t)\| \le \|x(t)\| \big(1 + \tfrac{1}{n}\big), \\ 0 & \text{if } \|x_n(t)\| > \|x(t)\| \big(1 + \tfrac{1}{n}\big). \end{cases}$$

Then the sequence of simple functions $\{y_n(t)\}$ satisfies

$$\|y_n(t)\| \le \|x(t)\| \big(1 + \tfrac{1}{n}\big) \le 2\|x(t)\|$$

and

$$\lim_{n\to\infty} \|x(t) - y_n(t)\| = 0 \quad \mu\text{-a.e.}$$

Thus, by the $\mu$-integrability of $\|x(t)\|$ and by the Dominated Convergence Theorem
for the Bochner Integral, $x(t)$ is Bochner $\mu$-integrable. ∎


Corollary. Let $T$ be a bounded linear operator on a B-space $X$ into a B-space $Y$,
$T \in \mathfrak{E}(X, Y)$. If $x(t)$ is an $X$-valued Bochner $\mu$-integrable function, then $T[x(t)]$
is a $Y$-valued Bochner $\mu$-integrable function; moreover

$$\int T[x(t)] \, d\mu = T\Big[\int x(t) \, d\mu\Big]. \tag{7.5.36}$$


Proof. Let a sequence of simple functions $\{y_n(t)\}$ satisfy

$$\|y_n(t)\| \le \|x(t)\| \big(1 + \tfrac{1}{n}\big) \quad \text{and} \quad \lim_{n\to\infty} \|y_n(t) - x(t)\| = 0 \quad \mu\text{-a.e.}$$

It follows from the linearity and continuity of $T$ that

$$\int T[y_n(t)] \, d\mu = T\Big[\int y_n(t) \, d\mu\Big].$$

Furthermore, by the continuity of $T$,

$$\|T[y_n(t)]\| \le \|T\| \, \|y_n(t)\| \le 2\|T\| \, \|x(t)\|, \qquad \lim_{n\to\infty} \|T[y_n(t)] - T[x(t)]\| = 0 \quad \mu\text{-a.e.}$$

Thus $T[x(t)]$ is Bochner $\mu$-integrable and

$$\int T[x(t)] \, d\mu = \lim_{n\to\infty} \int T[y_n(t)] \, d\mu = \lim_{n\to\infty} T\Big[\int y_n(t) \, d\mu\Big] = T\Big[\int x(t) \, d\mu\Big],$$

as asserted.


EXERCISE 7.5 


1. Let $(S, A, \mu)$ be an abstract measure space. Let F be the class of all real-valued
A-measurable functions defined on $S$. Define what should be meant by saying that
an element $f$ of F is quadratically $\mu$-integrable over $S$. State some properties of such
functions.


2. With the measure space as in Problem 1, let $X$ be a complex B-space and let F be the
class of all $X$-valued strongly A-measurable functions defined on $S$. Define what is
to be meant by saying that an element $f$ of F is quadratically $\mu$-integrable.

3. Let $\mathfrak{B}^2 = \mathfrak{B}^2(S, A, \mu; X)$ be the subset of F for which $\big[\int \|f(s)\|^2 \, d\mu\big]^{1/2}$ is finite.
Prove that this expression is a norm for the space $\mathfrak{B}^2$ and that $\mathfrak{B}^2$ is complete under
this norm.


4. Define $\mathfrak{B}^p = \mathfrak{B}^p(S, A, \mu; X)$ for any $p$, $1 \le p < \infty$. Define also $\mathfrak{B}^\infty$.

5. Prove Lemma 7.5.6. [Hint: Let $\{x_n;\ n = 1, 2, \ldots\}$ be a countable set dense in $X$.
Consider a mapping $x^* \to f_n(x^*) = [x^*(x_1), x^*(x_2), \ldots, x^*(x_n)]$ of the unit sphere
of $X^*$ into the $n$-dim Hilbert space $l^2(n)$.]
8 COMPLEX ANALYSIS IN LINEAR SPACES


In this chapter we shall be concerned with analytical mappings. We start with 
analytical functions on complex numbers to vectors. Here it will be shown that a 
substantial portion of the elements of complex function theory extends to B-spaces. 
Analytic functions of infinitely many variables were considered by D. Hilbert (1909) 
and F. Riesz (1913). The basic extensions to abstract spaces were given by Norbert 
Wiener in 1923, but much was added from different points of view and along 
various lines by Nelson Dunford, L. Fantappiè (mainly analytic functionals),
I. Gelfand, and A. E. Taylor in the 1930’s. 

There is also a theory of analyticity in commutative B-algebras from the 
algebra into itself due to E. R. Lorch (1943). There is a much older theory of 
analytic functions on vectors to vectors started by Maurice Fréchet in 1909 and 
developed under weaker assumptions by R. Gâteaux (published in 1919-22).

In the present chapter we shall restrict ourselves, on the one hand, to the 
generalization of the Cauchy theory to B-spaces, and, on the other, to the elements 
of the Fréchet theory. 

There are four sections: Abstract holomorphic functions; Theorem of Vitali; 
Functions analytic in the sense of Fréchet; and Some properties of (F)-analytic 
functions. 


8.1 ABSTRACT HOLOMORPHIC FUNCTIONS 


We consider two B-spaces $X$ and $Y$ over the complex field, which may coincide, and
their corresponding duals, $X^*$ and $Y^*$. Let $s$ be a complex variable, $D$ a domain in
the complex $s$-plane, and $x = f(s)$ a mapping from $D$ into $X$. This is a vector-valued
function. We shall also have occasion to consider operator-valued functions.
Let $T(s) \in \mathfrak{E}(X, Y)$ be defined for $s$ in $D$. The basic question is: What is to be
understood by saying that $f(s)$ or $T(s)$ is holomorphic in $D$?

In the Cauchy theory $f(s)$ is holomorphic in $D$ if the difference quotient has a
limit everywhere in $D$. This means that there exists a complex-valued function
$f'(s)$ such that

$$\lim_{h\to 0} \Big| \frac{1}{h} [f(s + h) - f(s)] - f'(s) \Big| = 0. \tag{8.1.1}$$
Let $\Gamma$ be a simple closed rectifiable oriented curve in $D$ which, together with its
interior, is a subset of $D$. Then it is known that for $s$ interior to $\Gamma$ we have the
Cauchy integral representation

$$f(s) = \frac{1}{2\pi i} \int_\Gamma \frac{f(t) \, dt}{t - s}. \tag{8.1.2}$$

Here the integral is a Riemann-Stieltjes integral, for the line of integration is a
rectifiable curve, i.e. it admits of a representation in the form $t = t(\sigma)$, $0 \le \sigma \le \omega$,
in terms of the arclength $\sigma$. The integral is then shorthand notation for

$$\frac{1}{2\pi i} \int_0^\omega \frac{f[t(\sigma)]}{t(\sigma) - s} \, dt(\sigma). \tag{8.1.3}$$

The integral exists since the integrand is a continuous function of $\sigma$. We shall see
that these formulas extend to the abstract case.

Formula (8.1.1) states that the difference quotient tends to the derivative every- 
where in D. Actually the convergence holds uniformly on compact subsets of D. 
This fact plays a basic role for the extensions, so we shall state and prove it in a form 
that will meet our needs. 


Theorem 8.1.1. For any function $f(s)$ holomorphic in the sense of Cauchy in
the simply connected domain $D$ and for any compact subset $S$ of $D$ there is a finite
quantity $M(f; S)$ such that for every $s$, $s + h$ and $s + k$ in $S$

$$\Big| \frac{1}{h - k} \Big\{ \frac{1}{h} [f(s + h) - f(s)] - \frac{1}{k} [f(s + k) - f(s)] \Big\} \Big| \le M(f; S). \tag{8.1.4}$$

This implies that the difference quotient tends uniformly to its limit in $S$.


Proof. Let $\Gamma$ be a simple closed rectifiable oriented curve in $D$ which contains $S$ in
its interior and which has a positive distance both from $S$ and from the boundary of
$D$. An elementary calculation based on (8.1.2) shows that the expression inside the
absolute value sign in the left member of (8.1.4) is represented by the Cauchy
integral

$$\frac{1}{2\pi i} \int_\Gamma \frac{f(t) \, dt}{(t - s)(t - s - h)(t - s - k)}. \tag{8.1.5}$$

Here $f(t)$ is bounded on $\Gamma$ and the denominator is bounded away from zero
uniformly in $s$, $s + h$, and $s + k$, and $\Gamma$ is of finite length. This shows the existence
of an $M(f; S)$ with the stated properties.

Actually (8.1.4) is a stronger assertion than what we need for uniform convergence.
The latter is obtained as follows. We first let $k \to 0$, obtaining after
multiplying the result by $h$,

$$\Big| \frac{1}{h} [f(s + h) - f(s)] - f'(s) \Big| \le |h| \, M(f; S), \tag{8.1.6}$$

from which the uniform convergence of the difference quotient to the derivative
follows upon letting $h \to 0$. ∎


Here the restriction to a simply connected domain is superfluous. There are 
classical extensions of Cauchy’s integral to multiply connected domains which may 
be used in (8.1.5). 

We now proceed to definitions of holomorphic vector and operator functions. 
In analogy with the situation for continuity we would expect weakly and strongly 
holomorphic vector functions and three varieties of holomorphic operator functions. 
Actually there is only one of each kind: the weakest property implies the strongest. 


Definition 8.1.1. With the notation as above, we say that $f(s)$ and $T(s)$ are
holomorphic in $D$ if $x^*[f(s)]$ and $y^*\{T(s)[x]\}$ are holomorphic in $D$ in the sense
of Cauchy for every choice of $x \in X$, $x^* \in X^*$, and $y^* \in Y^*$.


This is weaker than weak differentiability. Nevertheless, we shall see that the 
strongest form of differentiability follows. This is an elegant application of the 
uniform boundedness theorem. 


Theorem 8.1.2. (1) If $f(s)$ is holomorphic in $D$, then $f(s)$ is strongly continuous
and strongly differentiable in $D$, uniformly with respect to $s$ in any compact subset
of $D$. (2) If $T(s)$ is holomorphic in $D$, then $T(s)$ is continuous and differentiable in
the uniform operator topology for $s$ in $D$, uniformly with respect to $s$ in any
compact subset of $D$.


Proof. We shall give the proof for part (2), which involves two applications of the
uniform boundedness principle. To simplify the notation we write

$$\frac{1}{h - k} \Big\{ \frac{1}{h} [T(s + h) - T(s)] - \frac{1}{k} [T(s + k) - T(s)] \Big\} = T(s; h, k).$$

By Theorem 8.1.1 the assumptions in case (2) imply for the complex-valued
function $y^*\{T(s)[x]\}$ the inequality

$$|y^*\{T(s; h, k)[x]\}| \le M(y^*, x, T; S) \tag{8.1.7}$$

for every choice of $s$, $s + h$, $s + k$ in $S$. Here, as before, $S$ is any chosen compact
subset of $D$. By Theorem 7.2.3 this implies the existence of a finite $M(x, T; S)$
such that

$$\|T(s; h, k)[x]\| \le M(x, T; S). \tag{8.1.8}$$

Here we can apply Theorem 7.1.5, which in this case asserts the existence of a finite
$M(T; S)$ such that

$$\|T(s; h, k)\| \le M(T; S). \tag{8.1.9}$$


From now on we can proceed as in the proof of Theorem 8.1.1. We can let
$h, k \to 0$ and use completeness of $\mathfrak{E}(X, Y)$. If we multiply $T(s; h, k)$ by $h - k$ before
passing to the limit, we see that the difference quotient has a limit $T'(s) \in \mathfrak{E}(X, Y)$.
For $k \to 0$ we obtain

$$\Big\| \frac{1}{h} [T(s + h) - T(s)] - T'(s) \Big\| \le |h| \, M(T; S) \tag{8.1.10}$$

for all $s$ and $s + h$ in $S$. Thus the difference quotient tends uniformly to its limit,
the derivative, on compact subsets of $D$ just as in the complex-valued case. This, of
course, implies continuity of $T(s)$ in the uniform operator topology. We have also
differentiability in the strong and the weak topologies.

The proof of part (1) is left to the reader. We shall refer to $f(s)$ as
$X$-holomorphic since it is by assumption a holomorphic function with values in the
B-space $X$, with similar terminology in other cases. ∎


We have generalized the first approach to the notion of $X$-holomorphic functions
via differentiability of functionals. The same device leads to abstract Cauchy
integrals, more generally integrals of the Cauchy type. Our point of departure is
Theorem 7.4.2, integrals of the first kind

$$\int_a^b f(t) \, dg(t). \tag{8.1.11}$$


Here $f(t)$ is strongly continuous for $t$ in $[a, b]$ and $g(t)$ is numerically valued and of
bounded variation. The variable $t$ is real in this formula, while we want to define
integrals of the form

$$\int_\Gamma f(t) \, dt \tag{8.1.12}$$

taken along a rectifiable oriented arc $\Gamma$ in the complex $t$-plane, where $f(t)$ is strongly
continuous for $t$ on $\Gamma$. The reduction to type (8.1.11) is immediate. Again, using
arclength as parameter, we have a representation of $\Gamma$ by

$$t = t(\sigma), \quad 0 \le \sigma \le \omega, \tag{8.1.13}$$

where $t(\sigma)$ is a continuous function of bounded variation and $\omega$ is the total length of $\Gamma$.
We have then

$$\int_\Gamma f(t) \, dt = \int_0^\omega f[t(\sigma)] \, dt(\sigma), \tag{8.1.14}$$

which is of type (8.1.11) since $f[t(\sigma)]$ is strongly continuous in $\sigma$ and has values in $X$.
This is the sense to be attached to (8.1.12). Once we have made this convention we
shall feel free to use (8.1.12) rather than the more elaborate formula (8.1.14).

The integral (8.1.12) has properties which again are analogous to those of the
classical Riemann-Stieltjes integral. Thus the passage from $f$ to $J[f]$ is a linear
operation:

$$\int_\Gamma [\alpha_1 f_1(t) + \alpha_2 f_2(t)] \, dt = \alpha_1 \int_\Gamma f_1(t) \, dt + \alpha_2 \int_\Gamma f_2(t) \, dt. \tag{8.1.15}$$

It is also a bounded operation

$$\Big\| \int_\Gamma f(t) \, dt \Big\| \le \max_{t \in \Gamma} \|f(t)\| \, \omega, \tag{8.1.16}$$

which is the natural generalization of the classical estimate. The integral is also an
additive set function of the path of integration. If $\Gamma$ is the union, suitably oriented, of
two oriented subarcs $\Gamma_1$ and $\Gamma_2$ which have only endpoints in common, then

$$\int_\Gamma f(t) \, dt = \int_{\Gamma_1} f(t) \, dt + \int_{\Gamma_2} f(t) \, dt. \tag{8.1.17}$$


We recall that bounded linear operators may be taken under the sign of
integration so that

$$T\Big[\int_\Gamma f(t) \, dt\Big] = \int_\Gamma T[f(t)] \, dt, \quad T \in \mathfrak{E}(X, Y). \tag{8.1.18}$$

In particular, this holds for $T = x^* \in X^*$:

$$x^*\Big[\int_\Gamma f(t) \, dt\Big] = \int_\Gamma x^*[f(t)] \, dt. \tag{8.1.19}$$

So far $f$ was merely strongly continuous for $t$ on $\Gamma$. For $X$-holomorphic functions
we can prove Cauchy's theorem.


Theorem 8.1.3. Suppose that $f(s)$ is $X$-holomorphic in the sense of Definition
8.1.1 in a domain $D$ of the complex plane. Suppose that $\Gamma$ is a simple closed
rectifiable oriented curve in $D$ such that $f(s)$ is holomorphic inside and on $\Gamma$.
Then

$$\int_\Gamma f(t) \, dt = 0. \tag{8.1.20}$$

Remark. A curve $\Gamma$: $t = t(\sigma)$, $0 \le \sigma \le \omega$, is a simple closed curve iff

$$t(\sigma_1) = t(\sigma_2),\ \sigma_1 < \sigma_2 \quad \text{implies} \quad \sigma_1 = 0,\ \sigma_2 = \omega. \tag{8.1.21}$$

For such a curve the Jordan curve theorem assigns a definite meaning to the phrase
“inside of $\Gamma$”.


Proof. Apply (8.1.19) to the left member of (8.1.20). Then

$$x^*\Big[\int_\Gamma f(t) \, dt\Big] = \int_\Gamma x^*[f(t)] \, dt$$

for every $x^* \in X^*$. Since $x^*[f(t)]$ is a complex-valued function holomorphic in the
sense of Cauchy inside and on $\Gamma$, by the classical Cauchy theorem its integral along
$\Gamma$ is zero. Thus the integral in (8.1.20) is an element of $X$ on which every
functional in $X^*$ vanishes. Since the zero element of $X$ is the only one with this property,
(8.1.20) holds. ∎


The same type of argument leads to Cauchy’s integral. 
Theorem 8.1.4. If $f(s)$ is $X$-holomorphic inside and on the simple closed
rectifiable oriented curve $\Gamma$, then for $s$ inside $\Gamma$

$$f(s) = \frac{1}{2\pi i} \int_\Gamma \frac{f(t) \, dt}{t - s}. \tag{8.1.22}$$

Proof. Under the stated assumptions Cauchy's formula is valid for each of the
complex-valued functions $x^*[f(s)]$, $x^* \in X^*$, i.e.

$$x^*[f(s)] = \frac{1}{2\pi i} \int_\Gamma \frac{x^*[f(t)] \, dt}{t - s} = x^*\Big[\frac{1}{2\pi i} \int_\Gamma \frac{f(t) \, dt}{t - s}\Big],$$

so that

$$x^*\Big[f(s) - \frac{1}{2\pi i} \int_\Gamma \frac{f(t) \, dt}{t - s}\Big] = 0 \quad \text{for every } x^* \in X^*.$$

Thus (8.1.22) is valid. ∎
Theorem 8.1.5. An $X$-holomorphic function $f(s)$ has derivatives of all orders and

$$f^{(n)}(s) = \frac{n!}{2\pi i} \int_\Gamma \frac{f(t) \, dt}{(t - s)^{n+1}}. \tag{8.1.23}$$

Proof. If $f(s)$ is $X$-holomorphic, then it has a derivative $f'(s)$, and $x^*[f'(s)]$ is
holomorphic in the sense of Cauchy, $x^* \in X^*$. This, by Definition 8.1.1, makes
$f'(s)$ $X$-holomorphic and ensures the existence of $f''(s)$. Since $x^*[f''(s)]$ is Cauchy
holomorphic, $f''(s)$ is $X$-holomorphic, and so on. Thus the existence of all
derivatives is ensured and each $f^{(n)}(s)$ is $X$-holomorphic in the domain $D$ where $f(s)$
is supposed to have this property. Further, the representation (8.1.23) must hold,
for both sides are meaningful as elements in $X$ and their difference annihilates every
functional $x^* \in X^*$, and this requires that the difference is zero. ∎


Once we have Cauchy’s integral at our disposal for X-holomorphic functions, 
then one of the main tools for developing a function theory for X-analytic functions 
is available. We shall give a few samples of results obtainable in this manner. We 
start with the estimates of Cauchy. 


Theorem 8.1.6. If $f(s)$ is $X$-holomorphic in the closed disk $|s - s_0| \le r$ and if
$\max_\theta \|f(s_0 + re^{i\theta})\| = M$, then

$$\|f^{(n)}(s_0)\| \le M r^{-n} n!. \tag{8.1.24}$$

Proof. In (8.1.23) set $s = s_0$, $t = s_0 + re^{i\theta}$, take $\Gamma$ as the circumference of the
disk and apply (8.1.16). ∎
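
Written out, the estimate in this proof is a two-step computation. The sketch below uses only (8.1.23), the bound (8.1.16) with $\omega = 2\pi r$ the length of the circle, and the fact that $|t - s_0| = r$ on $\Gamma$.

```latex
\begin{align*}
\|f^{(n)}(s_0)\|
  &= \Big\| \frac{n!}{2\pi i} \int_\Gamma \frac{f(t)\,dt}{(t - s_0)^{n+1}} \Big\|
     && \text{by (8.1.23) with } s = s_0 \\
  &\le \frac{n!}{2\pi} \, \max_{t \in \Gamma} \frac{\|f(t)\|}{|t - s_0|^{n+1}} \cdot 2\pi r
     && \text{by (8.1.16), } \omega = 2\pi r \\
  &= \frac{n!}{2\pi} \cdot \frac{M}{r^{n+1}} \cdot 2\pi r
   = M r^{-n} \, n! .
\end{align*}
```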


The Cauchy estimates lead immediately to Taylor expansions for 
X-holomorphic functions. 
Theorem 8.1.7. If $f(s)$ is $X$-holomorphic in a domain $D$, if $s_0$ is a point in $D$ at a
distance $R$ from the boundary $\partial D$ of $D$, then

$$f(s) = \sum_{n=0}^{\infty} \frac{f^{(n)}(s_0)}{n!} (s - s_0)^n, \tag{8.1.25}$$

where the series is absolutely convergent for $|s - s_0| < R$.


Proof. That the series is absolutely convergent for $|s - s_0| < R$ follows from
(8.1.24). Here $f(s)$ is $X$-holomorphic in the disk and for each $x^* \in X^*$ we have

$$x^*[f(s)] = \sum_{n=0}^{\infty} \frac{1}{n!} \, x^*[f^{(n)}(s_0)] \, (s - s_0)^n$$

in the same disk. By the usual argument this requires that (8.1.25) holds. ∎


In a similar manner we can prove Laurent expansions for our functions

$$f(s) = \sum_{n=-\infty}^{\infty} a_n (s - s_0)^n, \quad a_n \in X, \tag{8.1.26}$$

valid in any annulus $0 \le R_1 < |s - s_0| < R_2 \le \infty$, where $f(s)$ is $X$-holomorphic.
Here the coefficients $a_n$ are given by analogues of the classical integral formulas.
In particular, if $R_1 = 0$ there is an isolated singularity of $f(s)$ at $s = s_0$, namely a
pole of order $m$ if $a_{-k} = 0$ for $k > m$ but $a_{-m} \ne 0$. If no such $m$ exists, we are dealing
with an essential singular point.

We have also the uniqueness or identity theorem the proof of which is left to the 
reader. 


Theorem 8.1.8. If $f(s)$ and $g(s)$ are $X$-holomorphic in $D$ and if a sequence $\{s_n\}$
exists with distinct points and at least one cluster point in $D$ such that
$f(s_n) = g(s_n)$, $\forall n$, then $f(s) \equiv g(s)$.


Let us return to formula (8.1.23). Here we set $s = s_0$, $t = s_0 + re^{i\theta}$ and obtain

$$\frac{r^n}{n!} f^{(n)}(s_0) = \frac{1}{2\pi} \int_0^{2\pi} f(s_0 + re^{i\theta}) \, e^{-ni\theta} \, d\theta. \tag{8.1.27}$$

This states that the left member is the $n$th trigonometric Fourier coefficient of
$f(s_0 + re^{i\theta})$ considered as a function of $\theta$. We observe that all Fourier coefficients
with negative subscript are zero. The case $n = 0$ is particularly interesting. Here

$$f(s_0) = \frac{1}{2\pi} \int_0^{2\pi} f(s_0 + re^{i\theta}) \, d\theta. \tag{8.1.28}$$

This shows that $f(s)$ has the mean value property which pertains to Cauchy holomorphic
functions as well as to harmonic functions. The formula has an important
consequence, namely

$$\|f(s_0)\| \le \frac{1}{2\pi} \int_0^{2\pi} \|f(s_0 + re^{i\theta})\| \, d\theta. \tag{8.1.29}$$
This inequality shows that $\|f(s)\|$ is a so-called subharmonic function in $D$. Such a
function has the important property of having no local maximum in D unless it is 
identically constant. We shall give a direct proof of this property rather than appeal 
to the theory of subharmonic functions which cannot be supposed to be familiar to 
the reader. The Principle of the Maximum now reads: 


Theorem 8.1.9. If $f(s)$ is $X$-holomorphic inside and on the simple closed
rectifiable curve $\Gamma$ and if

$$\max_{t \in \Gamma} \|f(t)\| = M, \tag{8.1.30}$$

then for every $s$ inside $\Gamma$

$$\|f(s)\| \le M \tag{8.1.31}$$

with equality for one such $s$ iff there is equality for all.


Proof. Again we use the analyticity of the functionals when applied to $f(s)$. Here,
$x^*[f(s)]$ is Cauchy holomorphic inside and on $\Gamma$, and by the principle of the
maximum for such functions

$$|x^*[f(s)]| \le \max_{t \in \Gamma} |x^*[f(t)]| = M(f; x^*)$$

with equality iff $x^*[f(s)]$ is a constant of absolute value $M(f; x^*)$. Here
$M(f; x^*) \le M \|x^*\|$. On the other hand,

$$\|f(s)\| = \sup_{\|x^*\| = 1} |x^*[f(s)]| \le \sup_{\|x^*\| = 1} M(f; x^*) \le M$$

so that (8.1.31) holds everywhere inside and on $\Gamma$.
Suppose now that there is a point $s_0$ inside $\Gamma$ where $\|f(s_0)\| = M$. We then
appeal to (8.1.29) and obtain

$$M = \|f(s_0)\| \le \frac{1}{2\pi} \int_0^{2\pi} \|f(s_0 + re^{i\theta})\| \, d\theta, \tag{8.1.32}$$

where the integrand is everywhere $\le M$ and $r < d(s_0, \Gamma)$. But a function with such
an upper bound cannot have an average value $\ge M$ unless it equals $M$ everywhere.
Hence $\|f(s)\| = M$ for $s$ anywhere in the disk $|s - s_0| < d(s_0, \Gamma)$.

To extend this to the rest of the interior of $\Gamma$, we argue as follows (essentially
an analytic continuation argument). If $s_1$ is a point inside $\Gamma$ but outside the disk
just mentioned, we can draw a polygonal line joining $s_0$ to $s_1$ with vertices at

$$t_0 = s_0,\ t_1,\ t_2,\ \ldots,\ t_n = s_1$$

such that for each $j$

$$|t_{j+1} - t_j| < d(t_j, \Gamma), \quad j = 0, 1, \ldots, n - 1.$$

In the first disk, $|s - t_0| < d(t_0, \Gamma)$, the identity $\|f(s)\| = M$ holds. In particular
it holds at $s = t_1$. We can then apply (8.1.32) again, replacing $s_0$ by $t_1$, and find that
$\|f(s)\| = M$ is true also in the second disk $|s - t_1| < d(t_1, \Gamma)$ and so on. Since $s_1$
is arbitrary, the identity holds everywhere inside $\Gamma$. It also holds on $\Gamma$ by the
continuity of the norm. ∎


In Theorem 8.1.9 X-holomorphism of f(s) on Γ is more than enough; continuity
on and inside Γ plus X-holomorphism inside Γ is sufficient for the desired
conclusion.

The conclusion in the case that there is a point s = s_0 inside Γ where
||f(s_0)|| = M differs from that in the classical case where we simply have f(s) equal
to a constant of absolute value M. No sharper conclusion is possible in the
X-valued case, however. This may be concluded from the situation in the space M_2
with norm

$$\|A\| = \max_j \bigl(|a_{j1}| + |a_{j2}|\bigr).$$

Here we take the matrix

$$A(s) = \begin{pmatrix} 1 & 0 \\ 0 & s \end{pmatrix} = e_{11} + e_{22}\, s. \tag{8.1.33}$$

This is a polynomial in s and certainly not a constant. Since

$$\|A(s)\| = \max\,(1, |s|)$$

it is seen that ||A(s)|| = 1 for |s| ≤ 1.
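The matrix example can be checked numerically. The following is a minimal sketch, assuming the norm on 2 × 2 matrices is the maximum absolute row sum as stated above; the helper names `row_sum_norm` and `A` are illustrative and not part of the text.

```python
import cmath

def row_sum_norm(a):
    """Maximum absolute row sum of a square matrix given as nested lists."""
    return max(sum(abs(entry) for entry in row) for row in a)

def A(s):
    """The matrix-valued polynomial of (8.1.33): diag(1, s)."""
    return [[1, 0], [0, s]]

# Sample points inside the closed unit disk and a few outside it.
samples = [0.9 * cmath.exp(1j * k) for k in range(6)] + [0, 0.5j, 1, 2, -3j]
for s in samples:
    # The norm should equal max(1, |s|): constant on the disk, growing outside.
    assert abs(row_sum_norm(A(s)) - max(1.0, abs(s))) < 1e-12
```

The norm is constant on the unit disk although A(s) itself is not constant, which is exactly the point of the example.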


EXERCISE 8.1 


1. Verify (8.1.5).
2. Verify (8.1.15) to (8.1.17).


3. Show that the series (8.1.25) converges absolutely for |s − s_0| < R. Why is the sum
of the series an X-holomorphic function of s in this disk?


4. Give a representation of the coefficients a_n of formula (8.1.26).


5. [Theorem of Liouville.] If f(s) is X-holomorphic throughout the finite plane and if
||f(re^{iθ})|| ≤ Mr^α, 0 ≤ θ < 2π, 1 < r, M and α fixed, 0 < α, show that f is a
polynomial in s of degree ≤ α. [What is a polynomial of degree ≤ 0?]


6. If f is X-holomorphic in 0 < |s − s_0| < R and if ||f(s_0 + re^{iθ})|| ≤ Mr^{−α},
0 ≤ θ < 2π, 0 < r < min(1, R), α fixed, 0 < α, show that s = s_0 is a pole of f of
order ≤ α. What conclusion may be drawn if the inequality holds for α = 0?


7. [Schwarz’s Lemma.] If f is X-holomorphic for |s| < 1, f(0) = 0 and ||f(s)|| ≤ M,
show that ||f(s)|| ≤ M|s|, |s| < 1. What is the condition for equality at some point
s = s_0, s_0 ≠ 0?
8. Referring to formula (8.1.33) for notation, we want to construct a linear bounded
functional x* on M_2 such that x*[A(s)] is (i) a constant on |s| < 1, (ii) not a constant.
Show that in the first case the functional must be a constant for all s and not merely
in the unit disk.


9. Let the sequence {f_n} be made up of functions which are X-holomorphic in a domain
D. Suppose that the sequence converges uniformly on compact subsets of D to a
limit f(s). Show that f is X-holomorphic in D.


10. With the same notation as in Problem 9, suppose that D is simply connected and its
boundary ∂D is a simple closed rectifiable oriented curve Γ. Suppose that the
functions f_n are strongly continuous in the closure of D and X-holomorphic in D. Suppose
that the sequence {f_n} converges uniformly on Γ. Show that there is uniform
convergence (in the strong topology of X) everywhere in the closure of D and the limit
function f is X-holomorphic in D.


11. With the same assumptions, show the sequence {f_n^{(p)}(s)}, for any fixed positive
integer p, converges uniformly to f^{(p)}(s) on compact subsets of D.


12. [Theorem of Morera.] If f(s) is strongly continuous in a simply connected domain D 
and if (8.1.20) holds for any choice of Γ as the perimeter of a triangle which together 
with its interior is a subset of D, then f is X-holomorphic in D. 


13. Prove Theorem 8.1.8.
14. If f is X-holomorphic in D, prove that its zeros, if any, can have no cluster point in D.


15. A function f is M_n-holomorphic in a domain D. Show that f is either algebraically
singular for all s or the algebraic singularities have no cluster point in D.


16. S(s) is a polynomial in s of degree m > 0 with coefficients in M_n. Does S(s)
necessarily have (1) zeros, (2) algebraic singularities?


8.2 THEOREM OF VITALI 


We shall give a brief exposition of a result relating to “induced convergence” of
sequences of X-holomorphic functions. The property of being X-holomorphic is 
not necessarily preserved under convergence, but it is so strongly adherent that it 
is preserved if the functions and the mode of convergence satisfy rather mild 
restrictions. Problem 10 of Exercise 8.1 (with X = C) is one of the earliest
instances of propagation of convergence, in this case inward propagation from the
boundary to the interior. For complex-valued functions Carl Runge (1856-1927)
found this theorem in 1884. Examples of outward propagation, from a subdomain 
to a containing domain, are known from the 1890’s. The problem of whittling 
down the assumptions was attacked and brought to a satisfactory conclusion in 
1903 by Giuseppe Vitali (1875-1932). The result is known as Vitali’s Theorem.
It was rediscovered by the brilliant but little-known American mathematician 
Milton Brockett Porter (1869-1960) in 1904. Actually there are two Vitali theorems 
of which only one lends itself readily to extensions for X-valued functions. The 
other requires compactness assumptions which are but rarely satisfied in B-spaces.
The generalized Vitali theorem reads as follows: 


Theorem 8.2.1. Suppose that {f_n} is a sequence of functions X-holomorphic in a
domain D where ||f_n(s)|| ≤ M for all n and all s. Suppose there exists a sequence
{s_k} in D with a cluster point s_0 in D such that lim f_n(s_k) (n → ∞) exists for
all k. Then

$$\lim_{n \to \infty} f_n(s) = f(s)$$

exists for all s in D, the limit f(s) is X-holomorphic in D, and the
convergence is uniform on compact subsets.


The original proof of Vitali does not generalize, but an alternate proof found
by Ernst Lindelöf in 1913 does extend to abstract spaces. It deals with the special
case where D is the open unit disk, |s| < 1, and s_0 = 0. From this the general result
is obtained by routine analysis using the same type of argument as in the proof of
the Principle of the Maximum. There is no restriction in taking M = 1. Lindelöf’s
theorem reads:


Theorem 8.2.2. Suppose that

$$f_n(s) = \sum_{j=0}^{\infty} a_{n,j}\, s^j, \qquad n = 1, 2, 3, \ldots, \tag{8.2.1}$$

where the series are absolutely convergent for |s| < 1 and ||f_n(s)|| ≤ 1, ∀ n, s.
Suppose there is a sequence {s_k} converging to 0 such that

$$\lim_{n \to \infty} f_n(s_k) \tag{8.2.2}$$

exists for all k. Then

$$\lim_{n \to \infty} f_n(s) = f(s) \tag{8.2.3}$$

exists for |s| < 1, the convergence is uniform for |s| ≤ R < 1, any R, and the
limit is X-holomorphic for |s| < 1.


Proof. The gist of the proof is to show that all sequences {a_{n,j}; n = 1, 2, 3, ...},
j = 0, 1, 2, ..., are Cauchy sequences. Once this has been done for j = 0, it will be
seen that the same result holds for j > 0 by a suitable induction process.

We start by observing that

$$\|a_{n,j}\| \le 1, \qquad \forall\, j, n \tag{8.2.4}$$

by Cauchy’s inequalities. Next we use Schwarz’s Lemma (Problem 7, Exercise 8.1).
Since f_n(s) − a_{n,0} = 0 for s = 0 and ||f_n(s) − a_{n,0}|| ≤ 2, it follows that

$$\|f_n(s) - a_{n,0}\| \le 2|s|, \qquad \forall\, n. \tag{8.2.5}$$

This we use to prove that {a_{n,0}} is a Cauchy sequence. Now

$$\|a_{n,0} - a_{p,0}\| = \|f_n(s_k) - a_{n,0} - f_n(s_k) + f_p(s_k) - f_p(s_k) + a_{p,0}\|$$

for any choice of s_k in the given sequence. Here the second member is at most equal
to

$$\|f_n(s_k) - a_{n,0}\| + \|f_n(s_k) - f_p(s_k)\| + \|f_p(s_k) - a_{p,0}\| \le 4|s_k| + \|f_n(s_k) - f_p(s_k)\|$$

by (8.2.5). Here we can choose k so large that

$$|s_k| < \tfrac{1}{8}\varepsilon,$$

where ε is a preassigned, arbitrarily small positive number. Further, for each fixed k
the sequence {f_n(s_k)} is Cauchy, so there is an N depending upon ε and k such that
for n, p > N

$$\|f_n(s_k) - f_p(s_k)\| < \tfrac{1}{2}\varepsilon.$$

Thus for n, p > N

$$\|a_{n,0} - a_{p,0}\| < \varepsilon.$$

Since ε is arbitrary, this shows that {a_{n,0}} is a Cauchy sequence in X. We set

$$\lim_{n \to \infty} a_{n,0} = a_0 \tag{8.2.6}$$

and note that

$$\|a_0\| \le 1 \tag{8.2.7}$$

by the continuity of the norm.
Suppose that we have proved, for j ≤ m, the existence of

$$\lim_{n \to \infty} a_{n,j} = a_j \tag{8.2.8}$$

and the inequality

$$\|a_j\| \le 1. \tag{8.2.9}$$

We now define a sequence of functions {f_{n,m+1}(s)} by

$$f_n(s) - \sum_{j=0}^{m} a_{n,j}\, s^j = s^{m+1} f_{n,m+1}(s). \tag{8.2.10}$$

Our next aim is to show that this sequence has the same properties as the original
sequence {f_n}, i.e. convergence for s = s_k, ∀ k, and uniform boundedness in |s| < 1,
the only difference being that we get a common bound equal to m + 2 instead of 1,
which does not affect the conclusion that {f_{n,m+1}(0)} = {a_{n,m+1}} is a Cauchy
sequence for fixed m.
It is clear that lim f_{n,m+1}(s_k) (n → ∞) exists for each k. To prove uniform
boundedness is more laborious. We choose two numbers r and R, 0 < r < R < 1. Consider

$$s^{-m-1} \Bigl[ f_n(s) - \sum_{j=0}^{m} a_{n,j}\, s^j \Bigr]$$

for |s| = R. Since ||f_n(s)|| ≤ 1 and ||a_{n,j}|| ≤ 1, ∀ j, n, the quantity displayed cannot
exceed

$$(m + 2) R^{-m-1}$$

in norm. By the Principle of the Maximum,

$$\max_{\theta} \|f_{n,m+1}(re^{i\theta})\| \le (m + 2) R^{-m-1}, \tag{8.2.11}$$

and this holds for any fixed r < R. Now the left member is independent of R, so
we may let R → 1 and obtain

$$\max_{\theta} \|f_{n,m+1}(re^{i\theta})\| \le m + 2$$

or, simply,

$$\|f_{n,m+1}(s)\| \le m + 2 \tag{8.2.12}$$

for all s with |s| < 1. This proves the uniform boundedness.
A new application of Schwarz’s Lemma gives

$$\|f_{n,m+1}(s) - a_{n,m+1}\| \le (m + 3)|s|,$$

true for all n and all m. From this point on the argument proceeds as above and we
conclude that (8.2.8) and (8.2.9) hold also for j = m + 1 and hence for all m.
With the aid of the coefficients a_j we now form the power series

$$f(s) = \sum_{j=0}^{\infty} a_j\, s^j \tag{8.2.13}$$

and aim to show that

$$\lim_{n \to \infty} \|f_n(s) - f(s)\| = 0, \tag{8.2.14}$$

the convergence being uniform for |s| ≤ R < 1. The series (8.2.13) is obviously
absolutely convergent for |s| < 1, and its sum is X-holomorphic in this disk.
We have now

$$\|f_n(s) - f(s)\| \le \sum_{j=0}^{m} \|a_{n,j} - a_j\|\, |s|^j + \sum_{j=m+1}^{\infty} \|a_{n,j}\|\, |s|^j + \sum_{j=m+1}^{\infty} \|a_j\|\, |s|^j,$$

where |s| ≤ R. Since the coefficients a_{n,j} and a_j are at most of norm 1,

$$\sum_{j=m+1}^{\infty} \|a_{n,j}\|\, R^j \le R^{m+1} (1 - R)^{-1}, \qquad \forall\, n,$$

and the same estimate is valid for the second infinite series.
Here we can choose m so large that

$$2 (1 - R)^{-1} R^{m+1} < \tfrac{1}{2}\varepsilon$$

for a given fixed ε. Since lim a_{n,j} = a_j (n → ∞) for each fixed j, we can choose
N so large that for fixed m and n > N

$$\sum_{j=0}^{m} \|a_{n,j} - a_j\|\, R^j < \tfrac{1}{2}\varepsilon.$$

Combining, we get

$$\|f_n(s) - f(s)\| < \varepsilon$$

for n > N and |s| ≤ R. Here ε is arbitrary, so (8.2.14) follows. ∎
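The choice of m in the last step of the proof is explicit enough to compute. The sketch below is only an illustration of that step: it assumes coefficients of norm at most 1, as in the proof, and the function names `tail_bound` and `smallest_m` are hypothetical.

```python
def tail_bound(R, m):
    """Bound 2 * R**(m+1) / (1 - R) for the two tails with coefficients of norm <= 1."""
    return 2.0 * R ** (m + 1) / (1.0 - R)

def smallest_m(R, eps):
    """Smallest m >= 0 with tail_bound(R, m) < eps / 2, assuming 0 < R < 1 and eps > 0."""
    m = 0
    while tail_bound(R, m) >= eps / 2.0:
        m += 1
    return m

# Example: on the disk |s| <= 0.5 with eps = 0.1 the tails drop below eps / 2
# once m is at least smallest_m(0.5, 0.1).
m = smallest_m(0.5, 0.1)
assert tail_bound(0.5, m) < 0.05
```

Once m is fixed in this way, only the finitely many differences a_{n,j} − a_j with j ≤ m remain to be controlled, which is why the coefficientwise convergence suffices.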


Proof of Theorem 8.2.1. The proof is essentially an exercise in the use of analytic
continuation together with the identity theorem. Without restricting the generality
we may assume s_0 = 0 and that the disk {s; |s| < 1} belongs to D. We have then
seen that lim f_n(s) = f(s) exists in the disk and that the convergence is uniform in
any smaller concentric disk, say |s| ≤ R. Suppose that s_1 ∈ D but |s_1| > R. We can
now join s_1 to the origin by a polygonal line in D with vertices

$$t_0 = 0,\ t_1,\ t_2,\ \ldots,\ t_n = s_1$$

chosen in such a manner that for each j

$$|t_{j+1} - t_j| < R\, d(t_j, \partial D). \tag{8.2.15}$$

We have lim f_n(s) = f(s) in the disk {s; |s| < 1} with uniform convergence in the
concentric disk {s; |s| ≤ R}. The latter contains the point s = t_1 together with a
small neighborhood of this point. In this neighborhood f_n(s) → f(s) everywhere.
Further, ||f_n(s)|| ≤ 1 for all s in the disk

$$|s - t_1| < d(t_1, \partial D) \tag{8.2.16}$$

if we take M = 1, which is no restriction. In this disk we have power series
expansions

$$f_n(s) = \sum_{j=0}^{\infty} b_{n,j}\, (s - t_1)^j$$

analogous to (8.2.1). The radius of convergence of these series is at least d(t_1, ∂D).
If this distance is < 1, the coefficients are not necessarily bounded uniformly with
respect to j, but the argument used in proving Theorem 8.2.2 holds with minor
modifications. Thus we conclude that lim f_n(s) = g(s) exists in the disk (8.2.16)
and the limit exists uniformly in the smaller disk

$$|s - t_1| \le R\, d(t_1, \partial D).$$

But in a small neighborhood of the point s = t_1 the limit is known to be f(s). Now
both f(s) and g(s) are known to be locally X-holomorphic so that the identity
theorem asserts that

$$g(s) = f(s).$$


More precisely expressed, g(s) is the analytic continuation of f(s) in (8.2.16) and
there is one and only one X-holomorphic function, hereinafter denoted by f(s),
that is the limit of the sequence {f_n(s)} in the union of {s; |s| < 1} with the disk
(8.2.16). By (8.2.15) the smaller disks of uniform convergence overlap.

In this manner we can proceed from one vertex to the next. After n steps we
reach s_1. Thus the polygonal line joining s_1 to 0 is covered by a chain of a finite
number of overlapping disks in which the sequence {f_n(s)} converges uniformly to
the function f(s). Here s_1 is any point of D so the limit exists everywhere in D and
every compact subset may be included in a region of uniform convergence. ∎


We note that the validity of the identity theorem for X-holomorphic functions 
implies that the Weierstrass principle of analytic continuation holds for such functions. 
We have also the Theorem of Monodromy: 


Theorem 8.2.3. If a locally X-holomorphic function f can be continued
analytically along every path in a simply connected domain D, then f is X-holomorphic
in D.


For the classical theorem of monodromy holds for the functionals x*[f]. 
Details are left to the reader. 

We have based the theory of X-analyticity upon that of C-analyticity via the
functionals. Thus f is X-holomorphic in a domain D iff x*[f] is C-holomorphic in D
for every x* ∈ X*. This implies that a point s = s_0 on ∂D where f is not
X-holomorphic must be a singular point of at least one (and hence infinitely many) of the
functions x*[f]. It is by no means necessary that all functions x*[f] have a
singularity at s_0. Thus we see that for a single-valued function f the singular points
of f are the union of the singularities of x*[f], where x* ranges over X*. Similarly, the
maximal domain of holomorphy of f is the intersection of all the corresponding
domains for x*[f].


EXERCISE 8.2 


1. How should the proof of Theorem 8.2.2 be modified if the series converges for
|s| < r instead of |s| < 1 and ||f_n(s)|| ≤ M?


2. Prove Theorem 8.2.3. 


3. [Theorem of the Phragmén-Lindelöf type.] Suppose that g is X-holomorphic and
bounded in the right half-plane. Suppose that lim g(r) = a as r → +∞. Show that
lim g(re^{iθ}) = a as r → ∞ for −π/2 < θ < π/2, uniformly in any closed interior sector.
[Hint: Set f_n(s) = g(ns) for |s| < 1 and apply Vitali’s theorem.]


4. If g is holomorphic in the right half-plane and there are two rays arg s = θ_1,
arg s = θ_2 in this half-plane along which g tends to distinct limits a and b, show that
g cannot be bounded in any sector containing the rays in its interior.


5. [Theorem of F. Carlson, H. Cramér, S. Wigert type.] If f is X-holomorphic and
bounded in the right half-plane and vanishes at the positive integers, show that
f(s) ≡ 0. [Hint: Consider the sequence {f(2^n s)} and construct a sequence
{s_k}, lim s_k = 1, such that lim f(2^n s_k) = 0 as n → ∞, ∀ k.]


8.3 FUNCTIONS ANALYTIC IN THE SENSE OF FRECHET 


The theory of analytic functions on vectors to vectors goes back to Maurice
Fréchet, who in 1909 introduced abstract differentials, polynomials, and power series.
Among later contributors to this field we note R. Gateaux, L. M. Graves, I. E.
Highberg, R. S. Martin, A. D. Michal, J. Sebastião e Silva, A. E. Taylor, and Max
Zorn.

Consider two B-spaces X and Y over the complex field and a mapping y = f(x)
from X to Y defined in a domain D ⊂ X. Here “domain” means an open connected
set: open in the strong topology of X and connected is understood to mean that any
two points of D may be joined in D by a polygonal line consisting of a finite
number of line segments and hence of finite length. We shall be concerned with the
properties of f(x + αh) where x ∈ D, h ∈ X, α is a complex variable and |α| is so small
that x + αh ∈ D. We plan to examine the differentiability properties of f(x + αh)
with respect to α.

A simple example will make the problem setting clearer and serve to motivate
the terminology to be used. Take X = L_2(−π, π), Y = C[−π, π], D = X, and

$$f(x)[t] = \int_{-\pi}^{t} [x(s)]^2\, ds, \qquad -\pi \le t \le \pi. \tag{8.3.1}$$

Since

$$f(\alpha x) = \alpha^2 f(x),$$

we refer to f(x) as a homogeneous quadratic polynomial in x. Here

$$f(x + \alpha h) = \int_{-\pi}^{t} [x(s) + \alpha h(s)]^2\, ds
= \int_{-\pi}^{t} [x(s)]^2\, ds + 2\alpha \int_{-\pi}^{t} x(s)\, h(s)\, ds + \alpha^2 \int_{-\pi}^{t} [h(s)]^2\, ds.$$

The difference quotients

$$\frac{1}{\alpha}\, [f(x + \alpha h) - f(x)] \quad \text{and} \quad \frac{1}{\alpha^2}\, [f(x + 2\alpha h) - 2 f(x + \alpha h) + f(x)]$$

tend to limits as α → 0, namely

$$2 \int_{-\pi}^{t} h(s)\, x(s)\, ds \quad \text{and} \quad 2 \int_{-\pi}^{t} [h(s)]^2\, ds$$

respectively. They are denoted by

$$\delta f(x; h) \quad \text{and} \quad \delta^2 f(x; h), \tag{8.3.2}$$


and are known as the first and second Fréchet differentials of f at the point x with
respect to the increment h. The term “variations” is also used. There are, of course,
also differentials of higher order, but they are all zero in the present case. These
differentials have remarkable properties. Thus δf(x; h) is homogeneous of the first
order both in x and in h, and for fixed x we see that δf(x; ·) ∈ E(X, Y). The second
differential is homogeneous of the second order in h. Moreover,

$$\|f(x + h) - f(x) - \delta f(x; h)\| \le C (\|h\|)^2, \tag{8.3.3}$$

where, on the left, we have the sup-norm, i.e. the norm in Y, and on the right the
L_2-norm of X.
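The two difference quotients of the example can be watched converging on a grid. The following is a rough numerical sketch, not part of the original text: it assumes real-valued sample functions, a trapezoidal rule on a uniform grid over [−π, π], and real increments α; the names `GRID`, `cumulative_integral`, `first_quotient`, and `second_quotient` are illustrative.

```python
import math

N = 2000                                   # number of subintervals (arbitrary choice)
DS = 2 * math.pi / N
GRID = [-math.pi + DS * i for i in range(N + 1)]

def cumulative_integral(values):
    """Trapezoidal approximation of t -> integral from -pi to t of the sampled integrand."""
    out = [0.0]
    for i in range(1, len(values)):
        out.append(out[-1] + 0.5 * (values[i - 1] + values[i]) * DS)
    return out

def f(x):
    """Discretised version of (8.3.1): t -> integral of x(s)**2 from -pi to t."""
    return cumulative_integral([v * v for v in x])

def sup_norm(y):
    return max(abs(v) for v in y)

def shifted(x, alpha, h):
    """Samples of x + alpha * h."""
    return [xv + alpha * hv for xv, hv in zip(x, h)]

x = [math.sin(s) for s in GRID]            # sample point x (an assumption)
h = [math.cos(2 * s) for s in GRID]        # sample increment h (an assumption)

first = cumulative_integral([2 * xv * hv for xv, hv in zip(x, h)])   # 2 * int x h
second = cumulative_integral([2 * hv * hv for hv in h])              # 2 * int h^2

def first_quotient(alpha):
    fx, fxa = f(x), f(shifted(x, alpha, h))
    return [(b - a) / alpha for a, b in zip(fx, fxa)]

def second_quotient(alpha):
    fx, f1, f2 = f(x), f(shifted(x, alpha, h)), f(shifted(x, 2 * alpha, h))
    return [(c - 2 * b + a) / alpha ** 2 for a, b, c in zip(fx, f1, f2)]

def error_first(alpha):
    """Sup-norm distance between the first difference quotient and the first differential."""
    return sup_norm([q - d for q, d in zip(first_quotient(alpha), first)])
```

Because f is quadratic, the error of the first quotient is proportional to |α|, while the second quotient reproduces the second differential up to rounding for every α ≠ 0.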

The reader may object that these properties are obvious if not trivial, but the 
important and not so obvious thing is that they generalize to much more complicated 
situations. 

Some definitions are in order at this point. 


Definition 8.3.1. A function f from D to Y with D = X is said to be
homogeneous of degree n if

$$f(\alpha x) = \alpha^n f(x), \qquad \forall\, \alpha \in C. \tag{8.3.4}$$

Such a function is said to be bounded if there is a finite M such that

$$\|f(x)\| \le M \|x\|^n, \qquad \forall\, x. \tag{8.3.5}$$


The situation covered by this definition may be regarded as an extension of the
linear case. If T ∈ E(X, Y), then T(x) is homogeneous of degree one in x and is
bounded, both properties in the sense of Definition 8.3.1 with n = 1.


Definition 8.3.2. A function f defined on a domain D ⊂ X with range in Y is said
to be Fréchet differentiable at x_0 ∈ D if (1)

$$\lim_{\beta \to 0} \frac{1}{\beta}\, [f(x_0 + \beta h) - f(x_0)] = \delta f(x_0; h) \tag{8.3.6}$$

exists for all h ∈ X, and (2) δf(x_0; h) is a linear bounded mapping from X to Y,
i.e. δf(x_0; ·) ∈ E(X, Y).


For the limit (8.3.6) we use the term first Fréchet differential or first variation
of f at x = x_0 with respect to the increment h and we say that f is (F)-differentiable
for short. There is also the weaker property of being (G)-differentiable, G for
Gateaux, but this concept will not be discussed here.

We note that the differential is additive in f, for if f = f_1 + f_2 and δf_1 and δf_2
exist, then by (8.3.6)

$$\delta [f_1 + f_2] = \delta f_1 + \delta f_2. \tag{8.3.7}$$
The properties of the differential qua function of h lie much deeper. Our point
of departure is the observation that (8.3.6) is equivalent to saying that

$$\Bigl[ \frac{d}{d\alpha} f(x + \alpha h) \Bigr]_{\alpha = 0} = \delta f(x; h), \tag{8.3.8}$$

where the derivative is the strong derivative in the space Y. This suggests
connections with the theory of Y-holomorphic functions of α. Our first result is


Theorem 8.3.1. Suppose that f is a mapping from X into Y defined in a sphere
S: {x; ||x − a|| < ρ} where ||f(x)|| ≤ M, ∀ x. Let f be such that for fixed x ∈ S
and fixed h ∈ X the mapping α → f(x + αh) is Y-holomorphic in α for

$$|\alpha| < [\rho - \|x - a\|]\, \|h\|^{-1} \equiv \sigma. \tag{8.3.9}$$

Then

$$\Bigl[ \frac{d^n}{d\alpha^n} f(x + \alpha h) \Bigr]_{\alpha = 0} \equiv \delta^n f(x; h) \tag{8.3.10}$$

exists for all n and all x ∈ S. There exists a concentric sphere

$$S_0: \{x;\ \|x - a\| < \rho_0\}, \qquad \rho_0 < \rho,$$

in which f is strongly continuous together with all its variations in their dependence
upon x. Further, for fixed x in S_0, the nth variation is homogeneous in h of
degree n and δ^n f(x; h) is bounded in the sense of (8.3.5) and strongly continuous
in h.


Proof. Since f(x + αh) is Y-holomorphic in α for fixed x and h, it admits of a
Maclaurin expansion in powers of α, say

$$f(x + \alpha h) = f(x) + \sum_{n=1}^{\infty} f_n(x, h)\, \alpha^n. \tag{8.3.11}$$

Here x ∈ S, h ∈ X, both fixed, and α satisfies (8.3.9). The series converges in norm
for such values of the arguments. Using formulas (8.1.25) and (8.3.10) we can now
rewrite (8.3.11) as

$$f(x + \alpha h) = f(x) + \sum_{n=1}^{\infty} \frac{1}{n!}\, \delta^n f(x; h)\, \alpha^n. \tag{8.3.12}$$

By Cauchy’s formulas for the derivatives

$$\delta^n f(x; h) = \frac{n!}{2\pi i} \int_{\Gamma} f(x + \beta h)\, \beta^{-n-1}\, d\beta, \tag{8.3.13}$$

where Γ, which is at our disposal, may be taken as the circle |β| = σ_0 < σ. Since
||f(x)|| ≤ M for all x in S,

$$\|\delta^n f(x; h)\| \le M\, n!\, \sigma_0^{-n}.$$

Since this holds for all σ_0 < σ, we get

$$\|\delta^n f(x; h)\| \le M\, n!\, \sigma^{-n}, \qquad \forall\, n, \tag{8.3.14}$$

and the series (8.3.12) converges in norm for |α| < σ. Here x is any point in S and h
is arbitrary in X. It should be noted that σ depends upon the choice of x and of h.
We now choose S_0 as the sphere {x; ||x − a|| < ρ/2} and restrict h temporarily to
satisfy ||h|| ≤ ρ/2. We can then take σ = 1 and (8.3.14) gives

$$\|\delta^n f(x; h)\| \le M\, n!, \qquad x \in S_0,\ \|h\| \le \tfrac{1}{2}\rho. \tag{8.3.15}$$

The series (8.3.12) now converges for |α| < 1.
The homogeneity and boundedness properties of δ^n f(x; h) with respect to h
in the sense of Definition 8.3.1 also follow from (8.3.13) if we observe that the path
of integration is at our disposal as long as it surrounds β = 0 once in the positive
sense and x + βh stays in S along the path. Replace h by γh, where γ is any
complex number ≠ 0, and take |γβ| = σ as the new path of integration. A simple
calculation gives

$$\delta^n f(x; \gamma h) = \gamma^n\, \delta^n f(x; h), \qquad \forall\, n. \tag{8.3.16}$$

Thus δ^n f(x; h) is homogeneous in h of degree n as asserted.
A particular choice of γ gives the boundedness property. Restrict x to the
sphere S_0 and take

$$\gamma = \tfrac{1}{2}\rho\, (\|h\|)^{-1}.$$

Then ||γh|| = ρ/2 and

$$\delta^n f(x; \gamma h) = (\tfrac{1}{2}\rho)^n\, \|h\|^{-n}\, \delta^n f(x; h).$$

By (8.3.15) the left member is at most M n! so that

$$\|\delta^n f(x; h)\| \le M \Bigl( \frac{2}{\rho} \Bigr)^{n} n!\, \|h\|^n \tag{8.3.17}$$

for all n, all x in S_0, and all h ∈ X.

We shall use these inequalities to estimate various approximations of
f(x + αh) − f(x) by the Maclaurin series (8.3.12). Here x ∈ S_0, ||h|| ≤ ρ/2, and
|α| < 1. For such values

$$\|f(x + \alpha h) - f(x)\| \le M \sum_{n=1}^{\infty} \Bigl( \frac{2 |\alpha|\, \|h\|}{\rho} \Bigr)^{n}.$$

Summing the geometric series and simplifying, we get

$$\|f(x + \alpha h) - f(x)\| \le M\, \frac{2 |\alpha|\, \|h\|}{\rho - 2 |\alpha|\, \|h\|}. \tag{8.3.18}$$

In the same manner we get

$$\Bigl\| \frac{1}{\alpha}\, [f(x + \alpha h) - f(x)] - \delta f(x; h) \Bigr\| \le \frac{M}{\rho}\, \frac{4 |\alpha|\, \|h\|^2}{\rho - 2 |\alpha|\, \|h\|}. \tag{8.3.19}$$

The first of these inequalities is a Lipschitz condition for f and shows that f is
continuous for x ∈ S_0. This has important implications. It is now seen that the
integrand f(x + βh) in (8.3.13) is not merely bounded, but strongly continuous in x
and in h. This holds with respect to x in S_0 for any fixed h ∈ X and with respect to h
for any fixed x in S_0. All we have to do is to shrink the radius of Γ to take care of
the shifting needs. The continuity of the integrand now implies the continuity of
the integral as a function of x and h, i.e. δ^n f(x; h) is strongly continuous in x and h.

The significance of (8.3.19) will become evident below. We observe here that it
implies that

$$\lim_{\|h\| \to 0} \frac{1}{\|h\|}\, \|f(x + h) - f(x) - \delta f(x; h)\| = 0 \tag{8.3.20}$$

for all x ∈ S_0. ∎
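The passage from (8.3.17) to (8.3.18) and (8.3.19) is only indicated in the proof. The following worked computation is a sketch of one way to fill in the step; the abbreviation q is introduced here for convenience and is not part of the original notation, and α ≠ 0 is assumed in the second line.

```latex
% Assumption: q := 2|\alpha|\,\|h\|/\rho, so that 0 \le q < 1 when \|h\| \le \rho/2 and |\alpha| < 1.
\begin{align*}
\|f(x+\alpha h)-f(x)\|
  &\le \sum_{n=1}^{\infty}\frac{|\alpha|^{n}}{n!}\,\|\delta^{n}f(x;h)\|
   \le M\sum_{n=1}^{\infty} q^{n}
   = \frac{Mq}{1-q}
   = M\,\frac{2|\alpha|\,\|h\|}{\rho-2|\alpha|\,\|h\|},\\
\Bigl\|\frac{1}{\alpha}\bigl[f(x+\alpha h)-f(x)\bigr]-\delta f(x;h)\Bigr\|
  &\le \frac{1}{|\alpha|}\sum_{n=2}^{\infty}\frac{|\alpha|^{n}}{n!}\,\|\delta^{n}f(x;h)\|
   \le \frac{M}{|\alpha|}\,\frac{q^{2}}{1-q}
   = \frac{M}{\rho}\,\frac{4|\alpha|\,\|h\|^{2}}{\rho-2|\alpha|\,\|h\|}.
\end{align*}
```

Both lines use (8.3.12) termwise together with the bound (8.3.17); the first sum starts at n = 1, the second at n = 2 because the linear term has been subtracted.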


Theorem 8.3.1 tells us very much about the properties of f and its variations
which are consequences of local boundedness and Y-holomorphy with respect to α
of the mapping α → f(x + αh). It looks as if f is (F)-differentiable in the sense of
Definition 8.3.2 at all points of the sphere S_0, but the proof is not complete:
δf(x; h) has been shown to be bounded and homogeneous of degree one as a
function of h, but we still have to prove additivity in h. This also follows from the
theory of Y-holomorphic functions, but in this case of two complex variables rather
than one. Since we are not planning to go deep into the theory, this difference is
not essential. Let h_1 and h_2 be arbitrary elements of X and α_1 and α_2 small complex
numbers. The assumption that α → f(x + αh) is a Y-holomorphic mapping then
implies that the mapping

$$(\alpha_1, \alpha_2) \to f(x + \alpha_1 h_1 + \alpha_2 h_2) \tag{8.3.21}$$

is Y-holomorphic in α_1 as well as in α_2. We take this as definition of the meaning
to be assigned to the statement that f(x + α_1 h_1 + α_2 h_2) is Y-holomorphic in
(α_1, α_2) in some neighborhood of the origin of the space C^2. In analogy with the
situation in C^1 we have now a Maclaurin series expansion

$$f(x + \alpha_1 h_1 + \alpha_2 h_2) = \sum_{j=0}^{\infty} \sum_{k=0}^{\infty} \alpha_1^j\, \alpha_2^k\, f_{jk}(x; h_1, h_2). \tag{8.3.22}$$


Here the coefficients f_{jk} are rational multiples of partial derivatives of the left
member with respect to α_1 and α_2, evaluated at (0, 0). This is the analogue of
(8.3.10). We have also analogues of (8.3.13) with double integrals rather than single
ones. We have

$$f_{00}(\cdot) = f(x), \qquad f_{10}(\cdot) = \delta f(x; h_1), \qquad f_{01}(\cdot) = \delta f(x; h_2). \tag{8.3.23}$$

The other coefficients are mixed variations and will not enter into our discussion.
We now set α_1 = α_2 = α in (8.3.22). On the one hand, we get

$$f[x + \alpha (h_1 + h_2)] = f(x) + \sum_{n=1}^{\infty} \frac{1}{n!}\, \delta^n f(x; h_1 + h_2)\, \alpha^n \tag{8.3.24}$$

by virtue of (8.3.13). On the other hand,

$$f[x + \alpha (h_1 + h_2)] = f(x) + \alpha\, [\delta f(x; h_1) + \delta f(x; h_2)] + \cdots$$

where the ellipsis (···) indicates a power series in α starting with terms of the second
degree. By the identity theorem for power series the coefficients of equal powers
must be equal. Hence we have

$$\delta f(x; h_1 + h_2) = \delta f(x; h_1) + \delta f(x; h_2). \tag{8.3.25}$$

This is the required relation and we have proved


Theorem 8.3.2. Under the assumptions of the preceding theorem δf(x; h) is a
linear bounded function of h. For fixed x ∈ S_0 the first variation of f is a linear
bounded transformation on X to Y, i.e. δf(x; ·) ∈ E(X, Y).


Corollary. f is (F)-differentiable in the sense of Definition 8.3.2 for all x in S_0.


A few more definitions will enable us to bring this discussion to a satisfactory 
conclusion. 


Definition 8.3.3. A mapping x → f(x) defined for x ∈ D ⊂ X with range in Y is
said to be locally bounded if for each a ∈ D there is a sphere

$$S(a): \{x;\ \|x - a\| < \rho(a)\}$$

and a positive number M(a) such that ||f(x)|| ≤ M(a) for all x ∈ S(a).


Definition 8.3.4. The mapping x → f(x) from X to Y is said to be (F)-analytic
in the domain D of X if it is single-valued, locally bounded, and (F)-differentiable
in D.


We have now


Theorem 8.3.3. A mapping x → f(x) from X to Y is (F)-analytic in a domain
D ⊂ X, if for each a ∈ D there exists a sphere S(a) in which the conditions of
Theorem 8.3.1 hold.


Formula (8.3.20) used to be postulated as one of the conditions for
(F)-differentiability. It was shown by Max Zorn in 1946 that this relation is a
consequence of the existence, boundedness and linearity of δf(x; h). It implies a
type of uniform (F)-differentiability which may be compared with the corresponding
situation for Y-holomorphic vector and operator functions as exemplified by
(8.1.10).

Some further properties of (F)-analytic functions will be studied in the next 
section. 


EXERCISE 8.3 


1. Verify (8.3.7) and (8.3.8). 
2. Verify (8.3.14) and (8.3.15). 
3. Verify (8.3.16) and (8.3.17).

4. Verify (8.3.18) to (8.3.20).

5. Express f_{jk}(x; h_1, h_2) as (1) a partial derivative, (2) a double integral.

6. Use the double integral to find an estimate of f_{jk} analogous to (8.3.14).

7. What happens to f_{jk}(x; h_1, h_2) if h_1 = h_2 = h?

8. Express δ^2 f(x; h_1 + h_2) in terms of f_{20}, f_{11}, and f_{02}.


9. Show that f_{11}(x; h_1, h_2) = δ^2 f(x; h_1, h_2) is the first variation with increment h_2 of
the first variation of f with increment h_1, and vice versa.


10. If Y = C and f is a linear bounded functional on X, prove that f is (F)-analytic for
all x and determine its first variation.


In the remaining problems (No. 22 excepted) X = Y = B is a non-commutative
B-algebra with unit element e.


11. If f(x) = Σ α_n x^n (summed over n = 0, 1, 2, ...) where α_n ∈ C, x ∈ B, x^0 = e,
and if the series converges in norm for ||x|| < ρ, show that its sum is (F)-analytic
in this sphere.


12. Compute δf(x; h) and discuss the convergence of the series. Remember that h and x
do not normally commute.


13. Determine spheres of absolute convergence for the two binomial series

$$f(x) = \sum_{n=0}^{\infty} \binom{1/2}{n} x^n, \qquad g(x) = \sum_{n=0}^{\infty} \binom{-1/2}{n} x^n.$$


14. Show that when the series converge their product in either order is the unit element of
B so that they are inverses of each other.


15. With the same notation, find δf(x; h) and δg(x; h). When do the series expansions
for the differentials converge?


16. Find the second differential of f with respect to h.
17. Prove that
f(x + ah) = f(x) flag) A) 
for suitable domains for α, h, x and determine such domains.


18. Verify that [f(x)]^2 = e + x when the series converges, so that f(x) defines a square
root of e + x in its domain of convergence. How can the validity of this relation be
extended by analytic continuation?


19. Similarly, g(−x) should be the square root of (e − x)^{−1} = R(1, x). For what values
of x is this true?


20. If D is a domain in B and if for a fixed complex α the resolvent R(α, x) exists for all
x in D, show that R(α, x) is (F)-analytic in D.


21. Find the first variation of R(α, x) for x in D.
22. For general B-spaces show that if f is (F)-analytic in a domain D ⊂ X, then for each
a ∈ D there is a sphere S(a) in which f satisfies the conditions of Theorem 8.3.1,
i.e. prove the converse of Theorem 8.3.3.
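For the problems on the two binomial series, a numerical sanity check in a concrete non-commutative algebra can be reassuring before attempting a proof. The sketch below is an illustration under stated assumptions, not a solution: it takes B to be the 2 × 2 real matrices, truncates the series after a fixed number of terms, and picks a sample x of small norm; all helper names are hypothetical.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def binomial_series(exponent, x, terms=80):
    """Partial sum of sum_n C(exponent, n) x^n with x^0 = e (truncated; an approximation)."""
    n = len(x)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)          # current power x^k
    coeff = 1.0                  # current generalized binomial coefficient C(exponent, k)
    for k in range(terms):
        total = [[total[i][j] + coeff * power[i][j] for j in range(n)] for i in range(n)]
        power = mat_mul(power, x)
        coeff *= (exponent - k) / (k + 1)
    return total

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(len(a)) for j in range(len(a)))

x = [[0.1, 0.3], [-0.2, 0.05]]   # sample element with small norm (an assumption)
f_x = binomial_series(0.5, x)    # candidate square root of e + x
g_x = binomial_series(-0.5, x)   # candidate inverse of f(x)

e_plus_x = [[1.0 + x[0][0], x[0][1]], [x[1][0], 1.0 + x[1][1]]]
```

With these values one can compare `mat_mul(f_x, g_x)` and `mat_mul(g_x, f_x)` with the unit element, and `mat_mul(f_x, f_x)` with e + x. Since all powers of a single x commute with one another, the non-commutativity of B only enters once increments h are introduced.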


8.4 SOME PROPERTIES OF (F)-ANALYTIC FUNCTIONS 


In view of the close relationship between (F)-analyticity and Y-holomorphism it is
to be expected that some of the results of Sections 8.1 and 8.2 will carry over to the
(F)-analytic case. We have already seen that the Cauchy integral and related tools
carry over. There are some essential differences, however. This is demonstrated
in a striking manner by the following extension of Theorem 8.1.8.


Theorem 8.4.1. If f is (F)-analytic in a domain D ⊂ X and if f(x) ≡ 0 in a
sphere S ⊂ D, then f is identically 0 in D. There are (F)-analytic functions
which vanish on a linear subspace of X but do not vanish identically.


Proof. Let S be the sphere {x; ||x − a|| < ρ}. Since f is (F)-analytic in D there
is a convergent Taylor series

$$f(a + \alpha h) = f(a) + \sum_{n=1}^{\infty} \frac{\alpha^n}{n!}\, \delta^n f(a; h). \tag{8.4.1}$$

Here f(a + αh) is Y-holomorphic for small values of |α| and of ||h||. For such values
f(a + αh) = 0 so by Theorem 8.1.8

$$f(a) = 0, \qquad \delta^n f(a; h) = 0, \qquad \forall\, n. \tag{8.4.2}$$

This shows that f(x) ≡ 0 not merely in S but in the concentric sphere
S_0: {x; ||x − a|| < d(a, ∂D)} since the series (8.4.1) converges trivially in S_0 to the
sum zero and represents f there.

If S_0 does not exhaust D, we use an analytic continuation argument. If b ∈ D
but is not in S_0, we choose a finite number of points

$$a_0 = a,\ a_1,\ a_2,\ \ldots,\ a_n = b,$$

such that for each j

$$\|a_{j+1} - a_j\| < d(a_j, \partial D).$$

We set S_j: {x; ||x − a_j|| < d(a_j, ∂D)}. Then f(x) ≡ 0 everywhere in S_0. There is a
sphere with center at x = a_1 which is contained in S_0 and in this sphere f(x) ≡ 0.
It follows by the argument used above that f(x) ≡ 0 in all of S_1 and so on. In a
finite number of steps it is found that f(x) ≡ 0 in S_n and in particular f(b) = 0.
Since b is an arbitrary point in D, it is seen that f(x) ≡ 0 in D.

To prove the final assertion of the theorem we recall that a linear bounded
functional f(x) ∈ X* is (F)-analytic (see Problem 10, Exercise 8.3). If X_0 is a linear
subspace of X which is not dense in X, then there exists a linear bounded functional
f such that f(x) = 0 for all x in X_0 while ||f|| = 1 so that f is not identically zero. ∎


The Principle of the Maximum is valid for (F)-analytic functions. 
Theorem 8.4.2. If f is (F)-analytic in the domain D and if

$$\sup_{x \in D} \|f(x)\| = M, \tag{8.4.3}$$

then either

$$\|f(x)\| < M \tag{8.4.4}$$

or

$$\|f(x)\| \equiv M \tag{8.4.5}$$

everywhere in D.


Proof. Suppose that there is an a ∈ D such that ‖f(a)‖ = M. Then by (8.3.13)

\[ f(a) = \frac{1}{2\pi i} \int_{\Gamma} f(a + \beta h)\, \frac{d\beta}{\beta}, \]

where Γ is any circle |β| = ρ such that ‖βh‖ < d(a, ∂D). This gives

\[ f(a) = \frac{1}{2\pi} \int_0^{2\pi} f(a + \rho e^{i\theta} h)\, d\theta \tag{8.4.6} \]

and

\[ \|f(a)\| \le \frac{1}{2\pi} \int_0^{2\pi} \|f(a + \rho e^{i\theta} h)\|\, d\theta. \tag{8.4.7} \]

Here the left member equals M while in the right member the integrand is ≤ M for
all θ. If the average of such a function is to be M, then by the continuity of
‖f(a + ρe^{iθ}h)‖ in θ, we must have ‖f(a + ρe^{iθ}h)‖ = M for all admissible ρ, θ and h.
In other words, (8.4.5) holds for all x with ‖x − a‖ < d(a, ∂D). The extension to
all of D is then obtained by the now-familiar chain argument and is left to the
reader. ∎


The theorem of Vitali is also valid for (F)-analytic functions, but both the
assumptions and the conclusions have to be modified. In Section 8.2 we spoke of
“induced convergence”; here a much more substantial “induction basis” is required.


Definition 8.4.1. A sequence {f_k} of functions (F)-analytic in a fixed domain
D ⊂ X is said to be equilocally bounded in D, if for each a ∈ D there is a sphere
S(a) and a positive number M(a) such that

\[ \|f_k(x)\| \le M(a), \qquad \forall x \in S(a),\ \forall k. \tag{8.4.8} \]

We have now the following analogue of Vitali’s theorem.


Theorem 8.4.3. Suppose that the sequence {f_k} is made up of functions which
are (F)-analytic and equilocally bounded in a fixed domain D ⊂ X and suppose
that

\[ \lim_{k \to \infty} f_k(x) = f(x) \tag{8.4.9} \]

exists in a sphere S ⊂ D. Then the limit exists everywhere in D and is
(F)-analytic. Further, for all n, all x ∈ D and all h ∈ X

\[ \lim_{k \to \infty} \delta^n f_k(x; h) = \delta^n f(x; h). \tag{8.4.10} \]


Remark. Note that the theorem does not assert uniform convergence as in
Theorem 8.2.1.


Proof. Since f_k is (F)-analytic, f_k(x + αh) is Y-holomorphic in α for x ∈ D and
h ∈ X, both fixed. We then have a sequence {f_k(x + αh)} to which Theorem 8.2.1
may possibly apply. We explore this possibility and play back and forth between
(F)-analyticity and Y-holomorphism to obtain the desired result.

Let a be the center of the sphere S. Without restricting the generality we may
assume that S coincides with the open sphere S(a) which exists by assumption. For
x + αh in S we have

\[ \lim_{k \to \infty} f_k(x + \alpha h) = f(x + \alpha h), \]

again by assumption. This says that for x ∈ S and any fixed h the mapping
α → f(x + αh) is Y-holomorphic in α in some neighborhood of α = 0. This implies
the existence of δf(x; h) for x ∈ S, and since ‖f(x)‖ ≤ M(a) in S we conclude that f
is (F)-differentiable and hence (F)-analytic in S.

If b ∈ D but is not in S, we can join a to b by a polygonal line Π in D. Each
point x on Π is the center of a sphere of equilocal boundedness and since Π is
compact in X, it follows that a finite number of these spheres will form a covering
of Π. We can choose these spheres S_1, S_2, ..., S_n so that S_1 = S and the center a_j
of S_j is interior to S_{j−1} for j = 2, 3, ..., n and a_n = b. A uniform bound of the
sequence f_k(x) in the union of the spheres is M = max_j M(a_j).

We now proceed to the point x = a_2, which by assumption is interior to
S_1 = S. It follows that there is a small sphere concentric to S_2 in which
lim_{k→∞} f_k(x) exists and equals f(x). Take any h ∈ X and consider the functions
f_k(a_2 + αh). If the radius of S_2 is r_2, then each mapping α → f_k(a_2 + αh) is
Y-holomorphic in α for |α| < r_2/‖h‖ and uniformly bounded in S_2. It follows that
lim_{k→∞} f_k(a_2 + αh) exists and must define f(a_2 + αh) for such values of α. Here
f(a_2 + αh) is Y-holomorphic as the limit of a uniformly convergent sequence of such
functions. Since h is arbitrary, f(x) exists everywhere in S_2. It is
(F)-differentiable because the existence of lim_{k→∞} f_k(a_2 + αh) implies also the
uniform convergence of the partial derivatives with respect to α. This shows the
existence of δf(x; h) as well as the boundedness and linearity of the variation.
Since ‖f(x)‖ ≤ M in S_2, we conclude that f is (F)-analytic in S_2. Since the spheres
S_1 and S_2 overlap, f in S_1 and f in S_2 are local representatives of the same
(F)-analytic function.

This argument can now be repeated for the following spheres S_3, S_4, ..., S_n
and shows the existence of f as an (F)-analytic function in the union of the spheres.
This conclusion then extends to all of D. We recall that

\[ \lim_{k \to \infty} f_k(x + \alpha h) = f(x + \alpha h) \]

also implies

\[ \lim_{k \to \infty} \frac{\partial^n}{\partial \alpha^n} f_k(x + \alpha h) = \frac{\partial^n}{\partial \alpha^n} f(x + \alpha h), \qquad \forall n, \tag{8.4.11} \]

in some neighborhood of α = 0. In particular this is true at α = 0, which is
(8.4.10). ∎


EXERCISE 8.4 


1. Suppose that {f_k} is a sequence of functions (F)-analytic in some domain D ⊂ X
but not necessarily equilocally bounded. Suppose that lim_{k→∞} f_k(x) = f(x) exists
everywhere in D. Show that f(x) is piecewise (F)-analytic in the following sense.
Every sphere in D contains a subsphere in which f_k(x) is uniformly bounded and in
which consequently f(x) is (F)-analytic.

A mapping x → P(x) from X to Y, defined for all x, is said to be a polynomial
in x of exact degree n if, for all a, h ∈ X and all complex numbers α,

\[ P(a + \alpha h) = \sum_{k=0}^{n} P_k(a, h)\, \alpha^k, \qquad P_n(a, h) \not\equiv 0, \]

where the P_k(a, h) are independent of α. P(x) is homogeneous of degree n if

\[ P(\beta x) = \beta^n P(x). \]


2. If P is homogeneous of degree n as well as a polynomial of exact degree n, what are 
the homogeneity properties of the P,(a, h)? 


3. Show that P_0(a, h) = P(a), P_n(a, h) = P(h). Show that δ^k P(a; h) exists for all
a, h and k with all variations being zero for k > n. The variations are not necessarily
bounded functions of a and h in any sphere. Show that a necessary condition for the
boundedness of the nth variation is that P(h) is bounded in the sense of Definition
8.3.1.


4. Let ω be a primitive (n + 1)th root of unity. “Primitive” means that no power of ω
with a positive exponent < n + 1 can equal unity. Form the system of equations

\[ P(a + \omega^j h) = \sum_{k=0}^{n} P_k(a, h)\, \omega^{jk}, \qquad j = 0, 1, 2, \ldots, n. \]

Show that this system has a determinant different from zero and may be solved for
the P_k(a, h) in terms of the P(a + ω^j h) linearly and with constant coefficients.


5. Use this result to show that the boundedness of P(x) in some sphere ||x|| < r 
implies the boundedness of all variations. Show that this implies that P is strongly 
continuous and (F)-analytic. 


6. Suppose that P is homogeneous of degree n and bounded in the sense of Definition
8.3.1. Show that P is continuous and (F)-analytic.

A mapping x → f(x) from a commutative B-algebra with unit element into itself
is said to be (L)-analytic (“L” for E. R. Lorch) in a domain D if for each x ∈ D there
is an element f′(x) of the algebra such that for ‖h‖ small

\[ \|f(x + h) - f(x) - f'(x) h\| = o(\|h\|). \]


7. Prove that an (L)-analytic function f is (F)-analytic and δf(x; h) = f′(x)h.


8. Let x ∈ 𝔅, a_n ∈ 𝔅, ∀n. Suppose that

\[ f(x) = \sum_{n=0}^{\infty} a_n x^n \]

converges in norm for ‖x‖ < r. Show that f is (L)-analytic and find f′(x).


9. If X = 𝔅 is the B-algebra C[0, 1] with the sup-norm, give examples of (L)-analytic
functions in this algebra.


9 BANACH ALGEBRAS


In this chapter we give a more elaborate discussion of Banach algebras, a topic
outlined in Section 2.6. We are chiefly, but not exclusively, concerned with
non-commutative B-algebras with unit element. Most of what is to be done may be
characterized as analysis in a B-algebra. Here the theory of the resolvent is
fundamental, and we study in some detail its properties, the related functional
equations and the nature of its isolated singularities. The resolvent is basic for the
theory of a wide class of Fréchet analytic functions from 𝔅 to 𝔅 where it plays a
role analogous in importance to that of the Cauchy kernel in classical analytic
function theory.

Part of this work is extended to B-algebras without unit element where the role
of the resolvent is taken over by the so-called dissolvent and the circle product
x ∘ y = x + y − xy plays an essential role. We also include a treatment of Gelfand’s
representation theorem for commutative B-algebras with unit element via the
maximal ideals along traditional lines.

There are four sections: Review; Resolvents and dissolvents; Gelfand’s
representation theorem; and (F)-Analytic functions and the operational calculus.


9.1 REVIEW 


The theory of Banach algebras, B-algebras for short, was sketched in Section 2.6. 
Here we shall give a brief review of what has already been given to serve as the 
foundation for further development. To this is added a brief discussion of the 
basic modifications required if the algebra has no unit element.

An algebra, as we use the term, is a collection of elements which is closed under
the operations of addition, scalar multiplication, and multiplication, subject to the
A-, S-, and M-postulates of Sections 2.1 and 2.6. A B-algebra 𝔅 is an algebra
which is also a Banach space and where the norm satisfies

\[ \|xy\| \le \|x\|\, \|y\|. \tag{9.1.1} \]

The algebra is said to be commutative if multiplication is commutative

\[ xy = yx \tag{9.1.2} \]

for all x, y. A non-commutative algebra may contain a set of elements z such that

\[ xz = zx \tag{9.1.3} \]

for all x ∈ 𝔅. This set {z; xz = zx, ∀x ∈ 𝔅} is called the center of the algebra. An
element e such that

\[ xe = ex = x, \qquad \forall x, \tag{9.1.4} \]

is called the unit element of 𝔅. There is at most one such element. We recall that
the normed metric of 𝔅 may be supposed to satisfy

\[ \|e\| = 1. \tag{9.1.5} \]


If 𝔅 has a unit element, then some elements x of 𝔅 have inverses, i.e. there
exists a y ∈ 𝔅 such that

\[ xy = yx = e. \tag{9.1.6} \]

To each x there is at most one y satisfying this relation. If such a y exists, we write

\[ y = x^{-1} \tag{9.1.7} \]

and refer to x as an invertible or regular element of 𝔅. It is clear that x^{−1} is also
regular and (x^{−1})^{−1} = x. The set 𝔊 of regular elements of 𝔅 is algebraically a
group, and topologically an open set in 𝔅. In particular, the open sphere

\[ \|x - e\| < 1 \tag{9.1.8} \]

is contained in 𝔊. The set 𝔊 is not bounded and need not be connected.

A non-regular element x of 𝔅 is called singular. The set 𝔖 of singular elements
is unbounded but connected via the origin since constant multiples of singular
elements are also singular. Moreover, x ∈ 𝔖, y ∈ 𝔅 implies xy and yx in 𝔖. If the
product xy or yx should be zero while neither x nor y is zero, then x and y are called
divisors of zero. Such elements are in 𝔖. The element x is idempotent if

\[ x^2 = x. \tag{9.1.9} \]

This equation is satisfied by x = e. Any other solution is in 𝔖. An x in 𝔖 is said to
be nilpotent of order p if

\[ x^p = 0 \tag{9.1.10} \]

and p is the least integer with this property. It is quasi-nilpotent or topologically
nilpotent if

\[ r(x) = \lim_{n \to \infty} \|x^n\|^{1/n} = 0. \tag{9.1.11} \]

With respect to a given element x of 𝔅, the complex numbers fall into two
disjoint complementary classes: the resolvent set ρ(x) and the spectrum σ(x),
according as λe − x is regular or singular. The former set is unbounded and open,
the latter bounded and closed. Neither set need be connected, and σ(x) may
separate the plane.
Here the quantity r(x) of formula (9.1.11) plays a decisive role. The existence 
of the limit was asserted in Theorem 2.6.4 but without a proof. We shall supply a 
proof now. 


Proof of Theorem 2.6.4. The existence of the limit is trivial if x is nilpotent since
then (9.1.10) holds for any exponent ≥ the order of nilpotency. Suppose now that
x^n ≠ 0 for all n. Since

\[ \|x^{m+n}\| \le \|x^m\|\, \|x^n\|, \]

we see that

\[ a_n = \log \|x^n\| \tag{9.1.12} \]

defines a so-called subadditive sequence, i.e.

\[ a_{m+n} \le a_m + a_n \tag{9.1.13} \]

for all m and n. The property to be proved is the existence of

\[ \lim_{n \to \infty} \frac{a_n}{n} = b, \tag{9.1.14} \]

where −∞ ≤ b < ∞. Intrinsically this fact has nothing to do with powers of x in
a B-algebra. It is simply a property of subadditive sequences as such. See Section
13.1. Suppose that a quantity b is defined by

\[ \inf_n \frac{a_n}{n} = b. \tag{9.1.15} \]

The infimum exists but may possibly be −∞. Here we assume b > −∞ and prove
that (9.1.14) holds. By the definition of the infimum there is an integer m such that
for given ε > 0

\[ \frac{a_m}{m} < b + \varepsilon. \]

Suppose now that n = km + p with 0 ≤ p < m (and a_0 = 0 when p = 0). Then

\[ b \le \frac{a_n}{n} = \frac{a_{km+p}}{km+p} \le \frac{k a_m + a_p}{km+p} = \frac{km}{km+p} \cdot \frac{a_m}{m} + \frac{a_p}{km+p}. \]

Here the last member has a limit < b + ε as k → ∞ since a_1, a_2, ..., a_{m−1} are fixed
numbers. Since ε is arbitrary, (9.1.14) holds. The case where b = −∞ is left to
the reader.

We now return to (9.1.12) and see that

\[ \lim_{n \to \infty} \frac{1}{n} \log \|x^n\| = \lim_{n \to \infty} \log \|x^n\|^{1/n} = b, \]

so that

\[ r(x) = \lim_{n \to \infty} \|x^n\|^{1/n} = \exp(b), \]

where, if b = −∞, we replace exp(b) by 0. ∎
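
The argument above can be watched numerically. The following is a minimal sketch, not part of the original text: the algebra is taken to be the 2 × 2 real matrices with the maximum-row-sum norm (a submultiplicative norm), and the test element x is a hypothetical choice with both eigenvalues equal to 0.5. It tabulates a_n = log ‖x^n‖, checks subadditivity on a finite range, and forms the estimates ‖x^n‖^{1/n}.

```python
import math

def mat_mul(p, q):
    """Product of two square matrices given as nested lists."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def row_sum_norm(p):
    """Maximum absolute row sum, a submultiplicative matrix norm."""
    return max(sum(abs(v) for v in row) for row in p)

# Hypothetical test element: upper triangular, both eigenvalues equal to 0.5.
x = [[0.5, 1.0], [0.0, 0.5]]

# log_norms[n] = a_n = log ||x^n|| for n = 1, ..., 200 (index 0 is unused).
log_norms = [None]
power = x
for n in range(1, 201):
    log_norms.append(math.log(row_sum_norm(power)))
    power = mat_mul(power, x)

# Subadditivity (9.1.13): a_{m+n} <= a_m + a_n, up to rounding error.
subadditive = all(
    log_norms[m + n] <= log_norms[m] + log_norms[n] + 1e-12
    for m in range(1, 101) for n in range(1, 101)
)

# exp(a_n / n) = ||x^n||^(1/n); for this x the values decrease toward 0.5.
estimates = {n: math.exp(log_norms[n] / n) for n in (1, 10, 50, 200)}
```

For this particular x one has ‖x^n‖ = 0.5^n (1 + 2n), so the estimates decrease slowly toward the spectral radius 0.5 although ‖x‖ = 1.5.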


We note that

\[ 0 \le r(x) \le \|x\|. \tag{9.1.16} \]

This quantity r(x) is known as the spectral radius of x because, as was shown in
Section 2.6, the spectrum σ(x) is confined to the disk

\[ |\lambda| \le r(x) \tag{9.1.17} \]

and there is at least one point of σ(x) on the rim of the disk. This implies, among
other things, that the resolvent set ρ(x) contains |λ| > r(x) and for such values of λ

\[ R(\lambda, x) \equiv (\lambda e - x)^{-1} = \frac{e}{\lambda} + \sum_{n=1}^{\infty} \frac{x^n}{\lambda^{n+1}}, \tag{9.1.18} \]

where the series converges in norm and r(x) is the precise radius of convergence.
This brings an end to our review. We add a short survey of the case where the
algebra 𝔅 lacks a unit element. While a unit element may be adjoined to the
algebra, we shall not use this device. Instead we base our discussion on the
composition

\[ x \circ y = x + y - xy \tag{9.1.19} \]

[S. Perlis (1942), N. Jacobson (1945), with many followers]. The composition is
associative, distributive over addition, and is commutative iff 𝔅 is commutative.
Here x = 0 plays the role of “neutral” element

\[ x \circ 0 = 0 \circ x = x, \qquad \forall x. \tag{9.1.20} \]


We can now ask about inverses with respect to the circle operation. If x ∈ 𝔅 and
there exists a (necessarily unique) y ∈ 𝔅 such that

\[ x \circ y = y \circ x = 0 \tag{9.1.21} \]

we shall call y the reverse of x (quasi-inverse and adverse are also used) and x is
reversible. We write x′ for the reverse of x when it exists. The set of reversible
elements forms a group ℜ in 𝔅 under the circle operation since

\[ x \circ y \circ y' \circ x' = 0. \]

Further, ℜ is an open point set in 𝔅 containing the unit sphere

\[ \|x\| < 1 \tag{9.1.22} \]

and for ‖x‖ < 1

\[ x' = -\sum_{n=1}^{\infty} x^n \tag{9.1.23} \]

since the series converges in norm and

\[ x + \Bigl(-\sum_{n=1}^{\infty} x^n\Bigr) - x \Bigl(-\sum_{n=1}^{\infty} x^n\Bigr) = 0. \]

Again we make a disjunction of the complex numbers according as x/λ is
reversible or not. Here λ = 0 plays an exceptional role and requires special
conventions, for which see the Exercise below. If x/λ ∈ ℜ, then λ is said to belong
to the dissolvent set δ(x), otherwise to the spectrum σ(x). The domain

\[ r(x) < |\lambda| \tag{9.1.24} \]

is a subset of δ(x) and for such values of λ the reverse of x/λ, denoted by D(λ, x) and
called the dissolvent of x, is given by

\[ D(\lambda, x) = -\sum_{n=1}^{\infty} \lambda^{-n} x^n. \tag{9.1.25} \]

Whenever D(λ, x) exists we have

\[ x + \lambda D(\lambda, x) = x D(\lambda, x) = D(\lambda, x)\, x. \tag{9.1.26} \]


The circle operation is definable in any B-algebra. If the algebra has a unit
element e, then we can consider simultaneously inverses and reverses. Note that

\[ (e - x)(e - y) = e \quad \text{iff} \quad x + y - xy = 0. \tag{9.1.27} \]

Thus e − x is invertible iff x is reversible. Applying this to x/λ, λ ≠ 0, we get

\[ D(\lambda, x) = -x R(\lambda, x). \tag{9.1.28} \]
The relation between the spectral sets we express by writing 
a(x) = o(x) [mod 0], (9.1.29) 


meaning that any λ, λ ≠ 0, which belongs to one of the sets also belongs to the
other, while λ = 0 may belong to either set without necessarily belonging to the
other.
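
The relations between reverses, dissolvents and resolvents can be checked by exact arithmetic in a small algebra with unit element. The sketch below is an illustration only: it assumes the algebra of 2 × 2 rational matrices, a hypothetical element x with spectrum {1, 3}, and the value λ = 5; the helper names are not from the text.

```python
from fractions import Fraction

def mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(p, q):
    return [[u + v for u, v in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * u for u in row] for row in p]

def sub(p, q):
    return add(p, scale(-1, q))

def inv2(p):
    """Inverse of a 2 x 2 matrix with Fraction entries."""
    (a, b), (c, d) = p
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def circle(x, y):
    """Circle product x o y = x + y - xy of (9.1.19)."""
    return sub(add(x, y), mul(x, y))

E = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
ZERO = [[Fraction(0), Fraction(0)], [Fraction(0), Fraction(0)]]

x = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(3)]]  # assumed element
lam = Fraction(5)                                             # lies in the resolvent set

R = inv2(sub(scale(lam, E), x))      # resolvent (lam*e - x)^(-1)
D = scale(-1, mul(x, R))             # dissolvent via (9.1.28)
x_over_lam = scale(1 / lam, x)

is_reverse = circle(x_over_lam, D) == ZERO and circle(D, x_over_lam) == ZERO
relation_9_1_26 = add(x, scale(lam, D)) == mul(x, D) == mul(D, x)
relation_9_1_27 = mul(sub(E, x_over_lam), sub(E, D)) == E
```

Because the entries are `Fraction` values, the three comparisons are exact identities rather than floating-point approximations.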


EXERCISE 9.1 


1. Show that b = −∞ is a possibility for a subadditive sequence.

2. If 𝔅 = C[a, b] and f ∈ 𝔅, describe ρ[f] and σ[f] and determine r(f).

3. Give an example of an element of C[a, b] whose spectrum separates the plane.

4. Find the dissolvent of an element f of C[a, b] and discuss the role of λ = 0. Find
necessary and sufficient conditions for λ = 0 to be in the dissolvent set.

5. Give an example of divisors of zero in C[a, b].

6. Are there any nontrivial idempotents in C[a, b]?

7. Can the matrix algebra 𝔐_n contain a quasi-nilpotent element which is not nilpotent?

8. L_1(−π, π) is a commutative B-algebra under convolution and without unit element
(see Theorem 4.3.11). In this algebra define the circle product by f ∘ g = f + g −
f ∗ g and determine the corresponding spectrum of f in terms of its Fourier
coefficients

\[ f_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(s) \exp(-ins)\, ds. \]

Find idempotent elements of this algebra.

9. Find the Fourier series of the dissolvent of f.

10. Prove that the circle product is associative, distributive over addition, and
commutative iff 𝔅 is commutative.

11. Show that nilpotent and quasinilpotent elements are reversible. Find the spectrum
σ(x) of such an element.

12. If j² = j ∈ 𝔅, show that j ∘ j = j.

13. Prove that ℜ is an open set in 𝔅.

14. The convention with regard to λ = 0 is as follows. The origin belongs to δ(x) iff
there exist elements j and y of 𝔅 such that j² = j, jx = xj = x, jy = yj = y,
xy = yx = j. If this is the case, D(0, x) = j by definition. Show that under these
assumptions

\[ D(\lambda, x) = j + \sum_{n=1}^{\infty} \lambda^n y^n \]

satisfies (9.1.26) for |λ| r(y) < 1.

15. Show that δ(x) is an open set in the complex plane.

16. If λ_0 ∈ δ(x), λ_0 ≠ 0, obtain an expansion for D(λ, x) in terms of powers of
(λ − λ_0)/λ and discuss its domain of convergence.


9.2 RESOLVENTS AND DISSOLVENTS 


We start with a discussion of resolvents assuming the B-algebra to have a unit
element. If now λ ∈ ρ(x), then the resolvent of x is defined as the inverse of λe − x:

\[ R(\lambda, x) = (\lambda e - x)^{-1}, \qquad \lambda \in \rho(x). \tag{9.2.1} \]

It follows that

\[ (\lambda e - x) R(\lambda, x) = R(\lambda, x) (\lambda e - x) = e. \tag{9.2.2} \]

Suppose now that λ and μ belong to ρ(x) though not necessarily to the same
component of the open set ρ(x). We have then

\[ (\lambda e - x) R(\lambda, x) = e, \qquad R(\mu, x) (\mu e - x) = e. \]

Multiply the first relation on the left by R(μ, x), the second on the right by R(λ, x),
and subtract. The result is

\[ (\mu - \lambda) R(\mu, x) R(\lambda, x) = R(\lambda, x) - R(\mu, x), \tag{9.2.3} \]

known as the first resolvent equation. Similarly, if λ ∈ ρ(x) ∩ ρ(y), then

\[ (\lambda e - x) R(\lambda, x) = e, \qquad R(\lambda, y) (\lambda e - y) = e. \]

Multiply the first equation on the left by R(λ, y), the second on the right by
R(λ, x), and subtract. The result is the second resolvent equation

\[ R(\lambda, y) (y - x) R(\lambda, x) = R(\lambda, y) - R(\lambda, x). \tag{9.2.4} \]

In the first equation λ and μ are the essential variables and x plays a
subordinate role. In the second the situation is reversed: now x and y are essential and
λ is secondary.
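
Both resolvent equations can be tested exactly in a concrete non-commutative algebra. The following sketch is an illustration under stated assumptions: the algebra is the 2 × 2 rational matrices, and the elements x, y and the values of λ, μ are hypothetical choices lying in the relevant resolvent sets.

```python
from fractions import Fraction

def mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(p, q):
    return [[u + v for u, v in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * u for u in row] for row in p]

def sub(p, q):
    return add(p, scale(-1, q))

def inv2(p):
    """Inverse of a 2 x 2 matrix with Fraction entries."""
    (a, b), (c, d) = p
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

E = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]

def resolvent(lam, x):
    """R(lam, x) = (lam*e - x)^(-1)."""
    return inv2(sub(scale(lam, E), x))

# Assumed data: x has spectrum {1, 3}, y has spectrum {-1, 1}; x and y do not commute.
x = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(3)]]
y = [[Fraction(0), Fraction(1)], [Fraction(1), Fraction(0)]]
lam, mu = Fraction(5), Fraction(7)

# First resolvent equation (9.2.3).
first_lhs = scale(mu - lam, mul(resolvent(mu, x), resolvent(lam, x)))
first_rhs = sub(resolvent(lam, x), resolvent(mu, x))

# Second resolvent equation (9.2.4); the order of the factors matters here.
second_lhs = mul(mul(resolvent(lam, y), sub(y, x)), resolvent(lam, x))
second_rhs = sub(resolvent(lam, y), resolvent(lam, x))

commute = mul(x, y) == mul(y, x)
```

The flag `commute` is false for this choice, so the check of (9.2.4) does exercise the non-commutative ordering R(λ, y)(y − x)R(λ, x).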

Let us now consider the functional equations

\[ (\mu - \lambda) R(\mu) R(\lambda) = R(\lambda) - R(\mu), \tag{9.2.5} \]
\[ S(y) (y - x) S(x) = S(y) - S(x), \tag{9.2.6} \]

per se as functional equations. In the first case R(λ) is a mapping from C into 𝔅
defined in some domain Δ of the complex plane with values in the B-algebra 𝔅, and
(9.2.5) is assumed to hold for all λ, μ in Δ. In the second case the mapping goes
from 𝔅 into itself and (9.2.6) is assumed to hold for all x, y in some domain D
(= open connected set) of 𝔅. We ask: What are the a priori properties of these
functions qua solutions of the functional equations? For this general type of
question, see Section 14.1.

We start with (9.2.5). 𝔅 may be non-commutative but (9.2.5) shows that R(λ)
and R(μ) must commute whenever they are defined. We started out with resolvents
in deriving (9.2.5). Now it is clear that (9.2.5) will be satisfied by resolvents, but
we have no right to assume that all solutions will be algebraically regular
everywhere in the domain of existence or even anywhere. This suggests that the obvious
crude method of solving (9.2.5) for R(λ) is likely to fail. But we can use a method of
successive substitutions which together with a boundedness assumption leads to
results. It will be shown that R(λ) is locally 𝔅-holomorphic.

Suppose that R(λ) is defined in some domain Δ of the complex plane where it
is bounded, ‖R(λ)‖ ≤ M, and satisfies (9.2.5). Suppose α ∈ Δ and set R(α) = a.
Here a ∈ 𝔅 is arbitrary and serves as an initial value which will determine R(λ)
uniquely. Set μ = α in (9.2.5) and note that

\[ R(\lambda) = a + (\alpha - \lambda)\, a R(\lambda). \]

By repeated substitution

\[ R(\lambda) = \sum_{k=0}^{n-1} (\alpha - \lambda)^k a^{k+1} + (\alpha - \lambda)^n a^n R(\lambda). \]

If now

\[ |\alpha - \lambda|\, r(a) < 1, \tag{9.2.7} \]

we can pass to the limit with n since the remainder goes to zero and the expansion

\[ R(\lambda) = \sum_{k=0}^{\infty} (\alpha - \lambda)^k a^{k+1} \tag{9.2.8} \]

is obtained. This series defines a 𝔅-holomorphic function of λ. Since Δ is
connected, the extension of R(λ) as a 𝔅-holomorphic mapping to all of Δ is routine
analysis.
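
As a numerical illustration of the expansion (9.2.8), the sketch below takes the solution to be an actual resolvent in the algebra of 2 × 2 real matrices. The element x, the base point α = 10 and the evaluation point λ = 6 are hypothetical choices; the initial value a = R(α) has spectral radius 1/7, so condition (9.2.7) reads |α − λ| < 7.

```python
def mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(p, q):
    return [[u + v for u, v in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * u for u in row] for row in p]

def inv2(p):
    (a, b), (c, d) = p
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

E = [[1.0, 0.0], [0.0, 1.0]]

def resolvent(lam, x):
    """R(lam) = (lam*e - x)^(-1), one particular solution of (9.2.5)."""
    return inv2(add(scale(lam, E), scale(-1.0, x)))

x = [[1.0, 2.0], [0.0, 3.0]]     # assumed element with spectrum {1, 3}
alpha, lam = 10.0, 6.0           # |alpha - lam| * r(a) = 4/7 < 1
a = resolvent(alpha, x)          # initial value a = R(alpha)

def partial_sum(n_terms):
    """Sum of (alpha - lam)^k a^(k+1) for k = 0, ..., n_terms - 1."""
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = a                    # holds a^(k+1)
    coeff = 1.0                  # holds (alpha - lam)^k
    for _ in range(n_terms):
        total = add(total, scale(coeff, power))
        power = mul(power, a)
        coeff *= alpha - lam
    return total

approx = partial_sum(80)
exact = resolvent(lam, x)
error = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

Only the initial value a and the number α enter the partial sums; the element x is used solely to produce a and the comparison value R(λ).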

As stated above, a ∈ 𝔅 is arbitrary and may be regular or singular. If a ∈ 𝔖,
then R(λ) ∈ 𝔖 for all λ satisfying (9.2.7) for

\[ R(\lambda) = a \sum_{k=0}^{\infty} (\alpha - \lambda)^k a^k \]

is the product of a ∈ 𝔖 with an element of 𝔅. Hence R(λ) is singular for all λ in the
disk (9.2.7). By analytic continuation with respect to λ this will hold all through Δ.
For any continuation will be of the form

\[ R(\lambda) = \sum_{k=0}^{\infty} (\beta - \lambda)^k b^{k+1}, \qquad b = R(\beta), \tag{9.2.9} \]

where b ∈ 𝔖. This implies that R(λ) is singular in the domain of convergence of the
series (9.2.9) and hence everywhere in Δ. A similar argument shows that if a ∈ 𝔊
instead, then so does R(λ) for all λ ∈ Δ.

A peculiar feature of solutions of (9.2.5) which are resolvents is that R(λ, x) is
𝔅-holomorphic in each of the components of the open set ρ(x). There may be
countably infinitely many components. Nevertheless, the values of R(λ, x) in
distinct components are related by (9.2.5). Thus we have the remarkable
phenomenon that distinct 𝔅-holomorphic functions may satisfy the same functional
equation and their values in distinct components are still linked by the very same
equation.

We shall now turn to the second functional equation (9.2.6). Here we assume
the existence of a mapping x → S(x) from 𝔅 to 𝔅 defined in a domain D of 𝔅 where
it satisfies (9.2.6) and is locally bounded in norm in the sense of Definition 8.3.3.
We shall show that S(x) is (F)-analytic in D in the sense of Definition 8.3.4. Set
y = a in (9.2.6) so that

\[ S(x) = S(a) + S(a) (x - a) S(x). \tag{9.2.10} \]

By repeated substitution

\[ S(x) = S(a) + S(a) \sum_{k=1}^{n} [(x - a) S(a)]^k + [S(a) (x - a)]^{n+1} S(x). \]

Since in general 𝔅 is noncommutative, we cannot group together the powers of
(x − a) or the powers of S(a). Assuming

\[ \|x - a\|\, \|S(a)\| < 1, \tag{9.2.11} \]

we conclude that the remainder goes to zero by the local boundedness of S(x) and
we obtain the expansion

\[ S(x) = S(a) + S(a) \sum_{k=1}^{\infty} [(x - a) S(a)]^k. \tag{9.2.12} \]


This shows that S is (F)-analytic. To start with, this holds for x in the sphere
defined by (9.2.11) and then by analytic continuation in D. Note that if a point
x = b has been reached by this process, then we have

\[ S(x) = S(b) + S(b) \sum_{k=1}^{\infty} [(x - b) S(b)]^k \tag{9.2.13} \]

in some neighborhood of x = b and this shows that the (F)-analytic character is
preserved. The formulas show, in addition, that if the initial value S(a) of S(x) is
algebraically singular, then S(x) has this property everywhere in D. On the other
hand, if S(a) is regular, so is S(x) in D.

An (F)-analytic function has differentials of all orders. In the present case

\[ \delta S(x; h) = S(x)\, h\, S(x) \tag{9.2.14} \]

and the expressions for the differentials of higher order may be read off from
(9.2.12) where x − a is replaced by h.
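
The expansion (9.2.12) and the differential (9.2.14) can be illustrated numerically with the particular solution S(x) = (λe − x)^{-1} of (9.2.6). This is a sketch under stated assumptions only: 2 × 2 real matrices, λ = 5, and hypothetical choices of the base point a and the increment h, with h small enough that the series converges quickly.

```python
def mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(p, q):
    return [[u + v for u, v in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * u for u in row] for row in p]

def inv2(p):
    (a, b), (c, d) = p
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def max_diff(p, q):
    return max(abs(u - v) for r, s in zip(p, q) for u, v in zip(r, s))

E = [[1.0, 0.0], [0.0, 1.0]]
LAM = 5.0

def S(x):
    """Assumed solution of (9.2.6): S(x) = (LAM*e - x)^(-1)."""
    return inv2(add(scale(LAM, E), scale(-1.0, x)))

a = [[1.0, 2.0], [0.0, 3.0]]       # base point (assumption)
h = [[0.1, -0.2], [0.3, 0.05]]     # increment x - a (assumption)
x = add(a, h)
Sa = S(a)

# Partial sum of (9.2.12): S(a) + S(a) * sum_{k=1}^{60} [(x - a) S(a)]^k.
step = mul(h, Sa)
term = step
series = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(60):
    series = add(series, term)
    term = mul(term, step)
series_error = max_diff(add(Sa, mul(Sa, series)), S(x))

# Difference quotient against the differential S(a) h S(a) of (9.2.14).
t = 1e-6
quotient = scale(1.0 / t, add(S(add(a, scale(t, h))), scale(-1.0, Sa)))
differential_error = max_diff(quotient, mul(mul(Sa, h), Sa))
```

The factors (x − a) and S(a) are kept in the order in which they occur, since h and S(a) do not commute for this choice.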

The preceding discussion was based on the assumption that 𝔅 has a unit
element and all statements regarding the algebraic character, regular or singular, of
the solutions become meaningful iff there is a unit element. But the functional
equations (9.2.5) and (9.2.6) do not involve the unit element, nor does our
discussion of these equations employ the existence of a unit element. Thus all
non-algebraic results stated above hold for any B-algebra with unit element or not.

We return to equation (9.2.5). We have found the character of the locally
holomorphic solutions of the equation in the neighborhood of a finite value λ = α.
The existence of solutions 𝔅-holomorphic at infinity is also known. Thus

\[ R(\lambda; a) = \frac{e}{\lambda} + \sum_{n=1}^{\infty} a^n \lambda^{-n-1}, \tag{9.2.15} \]

convergent for |λ| > r(a). We have no assurance, however, that all solutions
𝔅-holomorphic at infinity are of this type and, actually, this is not true. We can
replace e by j, an arbitrary singular idempotent, provided

\[ aj = ja = a. \tag{9.2.16} \]

This condition, of course, forces a to be singular. Moreover, we can add a constant
term q provided

\[ q^2 = 0 \quad \text{and} \quad jq = qj = 0. \tag{9.2.17} \]


Again, a solution of (9.2.5) may exist in a neighborhood of infinity without
being 𝔅-holomorphic at infinity. We can have polynomial solutions. Suppose that
q is a nilpotent in 𝔅 of order p. Then

\[ R(\lambda) = q - \lambda q^2 + \lambda^2 q^3 - \cdots + (-\lambda)^{p-2} q^{p-1} \tag{9.2.18} \]

is a solution. We can generalize this by letting q be quasi-nilpotent rather than
nilpotent. Thus

\[ R(\lambda) = \sum_{n=0}^{\infty} (-\lambda)^n q^{n+1} \tag{9.2.19} \]

is a solution having an essential singular point at infinity. Note that the last
solution is an entire function of λ. All these solutions, (9.2.15) excepted, are
algebraically singular for all values of λ. The last two solutions are simply (9.2.8)
for α = 0 and a = q.
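
A polynomial solution of the kind just described can be verified in integer arithmetic. The sketch below is an illustration with an assumed nilpotent: the 3 × 3 shift matrix q, for which q³ = 0, so that p = 3 and the solution reduces to R(λ) = q − λq². The helper names are hypothetical.

```python
def mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(p, q):
    return [[u + v for u, v in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * u for u in row] for row in p]

# Assumed nilpotent element: the 3 x 3 shift matrix.
q = [[0, 1, 0],
     [0, 0, 1],
     [0, 0, 0]]

def nilpotency_order(m):
    """Least p with m^p = 0; the loop ends only if m is nilpotent."""
    power, order = m, 1
    while any(any(row) for row in power):
        power = mul(power, m)
        order += 1
    return order

p = nilpotency_order(q)

def R(lam):
    """Sum of (-lam)^n q^(n+1) for n = 0, ..., p - 2."""
    total = [[0] * len(q) for _ in q]
    power, coeff = q, 1
    for _ in range(p - 1):
        total = add(total, scale(coeff, power))
        power = mul(power, q)
        coeff *= -lam
    return total

# Functional equation: (mu - lam) R(mu) R(lam) = R(lam) - R(mu).
satisfied = all(
    scale(mu - lam, mul(R(mu), R(lam))) == add(R(lam), scale(-1, R(mu)))
    for lam in range(-3, 4) for mu in range(-3, 4)
)
```

Since R(λ) is a multiple of q plus a multiple of q², every value is itself nilpotent, in line with the remark that such solutions are algebraically singular for all λ.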

We come now to the problem of finite singularities of R(λ). Since the spectrum
of a ∈ 𝔅 may be any bounded closed set, we realize that a finite singularity of a
solution R(λ) of (9.2.5) (all resolvents being solutions, together with non-resolvents
without end) need not be isolated at all. Non-isolated singularities are practically
inaccessible to the methods of analysis, with very rare exceptions, and this is all that
we can and will say in this case. For isolated singularities there is an elegant
theorem of M. Nagumo (1936), and this will now be discussed. At an isolated
point in the neighborhood of which R is single-valued, we have a convergent
Laurent expansion with coefficients of simple structure. Since equation (9.2.5) is
invariant under a shift

\[ \lambda \to \lambda + \alpha, \qquad \mu \to \mu + \alpha, \]

the singularity may be placed at the origin. Then

\[ R(\lambda) = \sum_{n=-\infty}^{\infty} a_n \lambda^n \equiv R^{+}(\lambda) + R^{-}(\lambda), \tag{9.2.20} \]

where R⁻ contains the terms with negative exponents and R⁺ the rest. We have
now 


Theorem 9.2.1. If λ = 0 is an isolated singularity of a solution R of (9.2.5),
then all coefficients in (9.2.20) commute, coefficients with negative subscripts
annihilate those with non-negative subscripts

\[ a_n a_{-k} = 0, \qquad n \ge 0,\ k > 0, \tag{9.2.21} \]

so that

\[ R^{+}(\lambda) R^{-}(\mu) = R^{-}(\mu) R^{+}(\lambda) = 0 \tag{9.2.22} \]

wherever these functions are defined. Further, there exist an element a, an
idempotent j and a quasinilpotent q, all in 𝔅, such that

\[ a_n = (-1)^n a^{n+1}, \quad n \ge 0, \qquad a_{-1} = j, \quad a_{-k} = q^{k-1}, \quad k > 1, \tag{9.2.23} \]

and

\[ aj = ja = 0, \qquad qj = jq = q. \tag{9.2.24} \]

Here R⁺(λ) converges in the disk r(a)|λ| < 1 which has a finite radius unless a is
also a quasinilpotent. R⁻(λ) converges for all λ with |λ| > 0.


Proof. We shall give an argument somewhat different from that of Nagumo. The
algebra is simpler and the required analysis is elementary function theory. By
assumption

\[ (\lambda - \mu) \sum_{m=-\infty}^{\infty} a_m \lambda^m \sum_{n=-\infty}^{\infty} a_n \mu^n = \sum_{n=-\infty}^{\infty} a_n \mu^n - \sum_{m=-\infty}^{\infty} a_m \lambda^m \tag{9.2.25} \]

with absolutely convergent series. Since (9.2.5) is unchanged when λ and μ are
interchanged, we conclude that the coefficients commute. Compare coefficients of
λ^m μ^n on both sides. If

\[ m \ne 0,\ n \ne 0: \qquad a_{m-1} a_n - a_m a_{n-1} = 0, \tag{9.2.26} \]
\[ m = 0,\ n \ne 0: \qquad a_{-1} a_n - a_0 a_{n-1} = a_n, \tag{9.2.27} \]
\[ m = 0,\ n = 0: \qquad a_{-1} a_0 - a_0 a_{-1} = 0. \]

Since we already know that the coefficients commute, the last equation gives no new
information, nor does the case m ≠ 0, n = 0 add to our information.

Since R⁺(λ) must have a positive radius of convergence, there exists an
A, 0 < A < ∞, such that

\[ \|a_n\| \le A^n, \qquad n > 0. \tag{9.2.28} \]

On the other hand, the power series for R⁻ must converge for all λ, 0 < |λ|. This
requires

\[ \lim_{k \to \infty} \|a_{-k}\|^{1/k} = 0. \tag{9.2.29} \]

We now consider (9.2.26) where we replace m by m + 1, 0 ≤ m, and take
n = −k, 0 < k. The result is

\[ a_m a_{-k} = a_{m+1} a_{-k-1} = a_{m+2} a_{-k-2} = \cdots = a_{m+p} a_{-k-p} \]

for all p. As p → ∞ the last member goes to zero by (9.2.28) and (9.2.29). It
follows that (9.2.21) holds and hence also (9.2.22).

The rest is easy. For n > 0 formula (9.2.27) gives

\[ a_n = -a_0 a_{n-1}. \]

This shows that we can take a_0 = a and obtain the first of the formulas under
(9.2.23).

Next we look at (9.2.27) again, now for n < 0. We get first

\[ (a_{-1})^2 = a_{-1}, \]

so that a_{−1} is an idempotent. This is our j. For k > 1

\[ a_{-1} a_{-k} = a_{-k} \quad \text{or} \quad j a_{-k} = a_{-k}. \]

We now use (9.2.26) with m = −k, n = −1 and obtain

\[ a_{-k} a_{-2} = a_{-k-1} a_{-1} = a_{-k-1} j = a_{-k-1}. \]

Hence setting a_{−2} = q we get

\[ a_{-k} = q^{k-1}, \qquad 1 < k. \]

Since the series for R⁻ must converge for all λ ≠ 0, it is seen that q is either a
nilpotent or a quasinilpotent. If q is a nilpotent of order p, then R⁻ has a pole of
order p at the origin, otherwise λ = 0 is an essential singular point. ∎

It is worth observing that a solution R(λ) of (9.2.5) with an isolated singularity
at λ = 0 is determined completely by three elements a, j, and q which act as
coefficients of λ^0, λ^{−1}, λ^{−2}, respectively. They must satisfy (9.2.24) and, in
addition, j must be an idempotent, q a nilpotent or quasinilpotent. Otherwise they
are completely arbitrary.
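
To see the three determining elements at work, one can build a solution from a concrete triple and test the functional equation (μ − λ)R(μ)R(λ) = R(λ) − R(μ) exactly. The sketch below is an illustration under assumptions: 3 × 3 rational matrices, a hypothetical idempotent j, a nilpotent q with q² = 0, and an element a = c·E_33 with c = 2. For this a the regular part Σ(−λ)^n a^{n+1} is a geometric series with sum a/(1 + cλ), valid for |cλ| < 1, and the code uses that closed form.

```python
from fractions import Fraction

def mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(p, q):
    return [[u + v for u, v in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * u for u in row] for row in p]

def frac_matrix(rows):
    return [[Fraction(v) for v in row] for row in rows]

c = Fraction(2)
j = frac_matrix([[1, 0, 0], [0, 1, 0], [0, 0, 0]])   # idempotent (assumption)
q = frac_matrix([[0, 1, 0], [0, 0, 0], [0, 0, 0]])   # nilpotent, q^2 = 0 (assumption)
a = frac_matrix([[0, 0, 0], [0, 0, 0], [0, 0, c]])   # a = c * E_33 (assumption)
ZERO = frac_matrix([[0, 0, 0], [0, 0, 0], [0, 0, 0]])

# Structural relations required of the triple (a, j, q).
structure_ok = (
    mul(j, j) == j and mul(q, q) == ZERO
    and mul(a, j) == ZERO and mul(j, a) == ZERO
    and mul(q, j) == q and mul(j, q) == q
)

def R(lam):
    """R(lam) = a/(1 + c*lam) + j/lam + q/lam^2, for 0 < |lam| < 1/c."""
    regular_part = scale(1 / (1 + c * lam), a)
    principal_part = add(scale(1 / lam, j), scale(1 / lam ** 2, q))
    return add(regular_part, principal_part)

lam, mu = Fraction(1, 5), Fraction(-1, 7)
lhs = scale(mu - lam, mul(R(mu), R(lam)))
rhs = add(R(lam), scale(-1, R(mu)))
```

In this example the principal part stops at λ^{−2} because q² = 0, so the origin is a pole of order 2.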

It is appropriate at this juncture to devote some attention to the case of a
B-algebra without a unit element where we are dealing with dissolvents rather than
resolvents. We recall the basic relation (9.1.26)

\[ x + \lambda D(\lambda, x) = x D(\lambda, x) = D(\lambda, x)\, x. \tag{9.2.30} \]

We can subject this relation to the same type of analysis as applied to the resolvent
relation (9.2.1). If λ and μ are in δ(x), we can eliminate x and obtain the first
dissolvent equation

\[ (\lambda - \mu) D(\mu) D(\lambda) = \lambda D(\lambda) - \mu D(\mu), \tag{9.2.31} \]

where we have suppressed x, which plays no further role for this equation.
Similarly, if λ ∈ δ(x) ∩ δ(y), we can eliminate λ and obtain the second dissolvent
equation

\[ E(y) (y - x) E(x) = y E(x) - E(y)\, x, \tag{9.2.32} \]

where λ has been omitted. In the first equation the mapping is from C into 𝔅, in
the second from 𝔅 into itself.
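
The first dissolvent equation can be checked exactly in an algebra that happens to have a unit element, where the dissolvent is available as D(λ, x) = −xR(λ, x). The sketch below is an illustration under assumptions: 2 × 2 rational matrices, a hypothetical element x with spectrum {1, 3}, and the values λ = 5, μ = 7.

```python
from fractions import Fraction

def mul(p, q):
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(p, q):
    return [[u + v for u, v in zip(r, s)] for r, s in zip(p, q)]

def scale(c, p):
    return [[c * u for u in row] for row in p]

def sub(p, q):
    return add(p, scale(-1, q))

def inv2(p):
    """Inverse of a 2 x 2 matrix with Fraction entries."""
    (a, b), (c, d) = p
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

E = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
x = [[Fraction(1), Fraction(2)], [Fraction(0), Fraction(3)]]   # assumed element

def dissolvent(lam):
    """D(lam, x) = -x (lam*e - x)^(-1), defined when lam*e - x is invertible."""
    return scale(-1, mul(x, inv2(sub(scale(lam, E), x))))

lam, mu = Fraction(5), Fraction(7)
D_lam, D_mu = dissolvent(lam), dissolvent(mu)

# Basic relation: x + lam*D = x*D = D*x.
basic_ok = add(x, scale(lam, D_lam)) == mul(x, D_lam) == mul(D_lam, x)

# First dissolvent equation: (lam - mu) D(mu) D(lam) = lam D(lam) - mu D(mu).
lhs = scale(lam - mu, mul(D_mu, D_lam))
rhs = sub(scale(lam, D_lam), scale(mu, D_mu))
```

The unit element appears only in the helper that produces D; the two identities being tested involve x, λ, μ and the dissolvents alone.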

Suppose that D(λ) is defined in a domain Δ which is at a positive distance from
the origin and that α ∈ Δ. Suppose that ‖D(λ)‖ ≤ M in Δ and set D(α) = a. Here
a is the initial value of the solution and may be any element of 𝔅, reversible or not.
From (9.2.31) we obtain

\[ D(\lambda) = \frac{\alpha}{\lambda}\, a + \frac{\lambda - \alpha}{\lambda}\, a D(\lambda) \]

and by repeated substitution

\[ D(\lambda) = \frac{\alpha}{\lambda}\, a + \frac{\alpha}{\lambda} \Bigl( \frac{\lambda - \alpha}{\lambda} \Bigr) a^2 + \cdots + \frac{\alpha}{\lambda} \Bigl( \frac{\lambda - \alpha}{\lambda} \Bigr)^{n-1} a^n + \Bigl( \frac{\lambda - \alpha}{\lambda} \Bigr)^{n} a^n D(\lambda). \]

Hence, if

\[ \Bigl| \frac{\lambda - \alpha}{\lambda} \Bigr|\, r(a) < 1, \tag{9.2.33} \]

the remainder goes to 0 and we get the expansion

\[ D(\lambda) = \frac{\alpha}{\lambda} \sum_{n=1}^{\infty} \Bigl( \frac{\lambda - \alpha}{\lambda} \Bigr)^{n-1} a^n. \tag{9.2.34} \]

The domain of convergence given by (9.2.33) is a circular disk or a half-plane or the
exterior of a circle according as r(a) is > 1, = 1 or < 1.

A solution of (9.2.31) may be holomorphic at infinity and vanish there as
shown by (9.1.25) with

\[ D(\lambda, x) = -\sum_{n=1}^{\infty} x^n \lambda^{-n}. \tag{9.2.35} \]

Again a solution of (9.2.31) need not be holomorphic at infinity. The technique
used in proving Theorem 9.2.1 will also give


Theorem 9.2.2. Every solution of (9.2.31) which is 𝔅-holomorphic in
0 < R < |λ| < ∞ is of the form

\[ D(\lambda) = \sum_{n=1}^{\infty} q^n \lambda^n + j + \sum_{n=1}^{\infty} (-1)^{n-1} a^n \lambda^{-n} \tag{9.2.36} \]

with

\[ aj = ja = 0, \qquad j^2 = j, \qquad qj = jq = q, \qquad \lim_{n \to \infty} \|q^n\|^{1/n} = 0. \tag{9.2.37} \]

Proof. Set

\[ D(\lambda) = \sum_{n=-\infty}^{\infty} a_n \lambda^n. \]

The convergence assumptions for this Laurent series imply that

\[ \lim_{n \to \infty} \|a_n\|^{1/n} = 0, \qquad \|a_{-k}\| \le A^k, \quad \forall k,\ k > 0, \tag{9.2.38} \]

for some A, R < A < ∞. Substitution of the series in (9.2.31) shows that the
coefficients commute and leads to the relations

\[ a_{m-1} a_n = a_m a_{n-1}, \qquad m \ne 0,\ n \ne 0, \tag{9.2.39} \]
\[ a_{m-1} a_0 - a_m a_{-1} = a_{m-1}, \qquad m \ne 0,\ n = 0. \tag{9.2.40} \]

If in (9.2.39) we replace m by m + 1, m ≥ 0, and set n = −k, 0 < k, we get

\[ a_m a_{-k} = a_{m+1} a_{-k-1} = a_{m+2} a_{-k-2} = \cdots = a_{m+p} a_{-k-p}. \]

This goes to zero as p → ∞ by (9.2.38). Thus the coefficients with negative
subscripts annihilate the coefficients with non-negative subscripts.

Next, from (9.2.40) with m = 1,

\[ a_0^2 = a_0, \quad \text{so} \quad a_0 = j, \]

an idempotent. Again, from (9.2.40),

\[ a_{m-1} a_0 = a_{m-1}, \quad \text{or} \quad j a_m = a_m j = a_m, \qquad m \ge 0. \]

We set a_1 = q where the notation indicates that q is expected to be a quasinilpotent.
Then from (9.2.39) with m = 1, n > 1, we get a_0 a_n = a_1 a_{n−1} so that

\[ q a_{n-1} = j a_n = a_n \quad \text{or} \quad a_n = q^n. \]

It remains to analyze the coefficients with negative subscripts. It is already
known that a_{−k} is annihilated by a_m, 0 < k, 0 ≤ m, and this condition implies and is
implied by

\[ j a_{-k} = a_{-k} j = 0, \qquad k > 0, \]

since q = jq = qj. We again resort to (9.2.40) where we set m = −k and obtain

\[ a_{-k-1} = -a_{-1} a_{-k}. \]

We set a_{−1} = a and obtain, finally,

\[ a_{-k} = (-1)^{k-1} a^k. \]

This is the last identity to be proved. The convergence assumptions show that q is a
quasinilpotent. If q is a nilpotent, then D(λ) has a pole at infinity, otherwise an
essential singularity. ∎


Note that (9.2.35) is a special case of (9.2.36) for $j = 0$, $q = 0$, $a = -x$.
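The representation (9.2.36) can be tried out on matrices. The sketch below is an illustration under stated assumptions, not part of the text: the algebra is that of 4 by 4 complex matrices, $q$ is chosen nilpotent with $q^{2} = 0$ so that the entire part reduces to $j + \lambda q$, the principal part is summed in closed form as $a(\lambda e + a)^{-1}$ for $|\lambda| > r(a)$, and (9.2.31) is assumed to read $\lambda D(\lambda) - \mu D(\mu) = (\lambda - \mu) D(\lambda) D(\mu)$. The names `laurent_solution`, `j`, `q`, `a` are chosen here for illustration.

```python
import numpy as np

def laurent_solution(lam, j, q, a):
    """Evaluate (9.2.36) when q @ q = 0: j + lam*q + a (lam*e + a)^{-1}."""
    e = np.eye(a.shape[0], dtype=complex)
    # sum_{n>=1} (-1)**(n-1) a**n lam**(-n) = a (lam*e + a)^{-1} for |lam| > r(a)
    principal_part = a @ np.linalg.inv(lam * e + a)
    return j + lam * q + principal_part

# Block construction on C^4: j and q act on the first two coordinates, a on the last two,
# so that aj = ja = 0, j^2 = j, qj = jq = q as required by (9.2.37).
j = np.diag([1.0, 1.0, 0.0, 0.0]).astype(complex)
q = np.zeros((4, 4), dtype=complex)
q[0, 1] = 1.0
a = np.zeros((4, 4), dtype=complex)
a[2:, 2:] = [[0.3, 1.0], [0.0, -0.2]]

lam, mu = 2.0 + 1.0j, -3.0
d_lam = laurent_solution(lam, j, q, a)
d_mu = laurent_solution(mu, j, q, a)
# Residual of the assumed form of (9.2.31).
residual = lam * d_lam - mu * d_mu - (lam - mu) * d_lam @ d_mu
```

Since $q$ is nilpotent in this sketch, the resulting $D(\lambda)$ has a pole at infinity, in agreement with the last remark of the proof.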

There is obviously a problem concerning the nature of finite isolated 
singularities of solutions of (9.2.31). This appears more difficult than either the 
infinite case or the finite case for (9.2.5). This question is left to the reader for 
consideration. 

Equation (9.2.32) presents further complications which are largely due to the 
quirks of the underlying B-algebra which are apt to interfere with the proposed 
argument. We refer to the Exercise below for some of these difficulties. Here we 
note merely that if $E$ is (F)-analytic in some domain $D$ of $\mathfrak{B}$, then its first differential
satisfies 

xdE(x;h) = E(x)h — E(x) hE(x). (9.2.41) 


EXERCISE 9.2 


1. Why is (9.2.9) the form of the continuation of $R(\lambda)$ in $\Delta$? What is the first step in the
argument?

2. Same question for (9.2.13).

3. What is the expression for $\delta^{n} S(x; h)$, $n > 1$?

4. Justify (9.2.16) and (9.2.17).

5. If $R_{1}$ and $R_{2}$ both satisfy (9.2.5) and if
\[ R_{1}(\lambda) R_{2}(\mu) = R_{2}(\mu) R_{1}(\lambda) = 0 \]
for all relevant $\lambda$ and $\mu$, show that $R_{1} + R_{2}$ is also a solution.
6. Show that R* and R™ are both solutions of (9.2.5). 


7. If $R$ is a solution of (9.2.5), show that $R'(\lambda) = -[R(\lambda)]^{2}$. Express the higher
derivatives in terms of $R(\lambda)$. The differential equation $w'(z) = -[w(z)]^{2}$, where $z$
and $w$ are complex numbers, is a special case of Riccati's equation.


8. Derive (9.2.31). 


9. Derive (9.2.32).


10. If $D$ is a solution of (9.2.31), show that
\[ \lambda D'(\lambda) = [D(\lambda)]^{2} - D(\lambda). \]
The classical analogue is also a special case of Riccati's equation.


11. In (9.2.34) set $a = j$, an idempotent, and sum the series in closed form. Can this
expression be the dissolvent of some $x \in \mathfrak{B}$?


12. Verify (9.2.41). 


13. Consider the commutative convolution algebra $L_{2}(-\pi, \pi)$ without unit element and
consider (9.2.32) in this setting,
\[ G * (f - g) * F = G * f - g * F, \qquad F = E(f),\ G = E(g). \]
Let $f$, $g$, $F$, $G$ have the Fourier coefficients $\{f_{n}\}$, $\{g_{n}\}$, $\{F_{n}\}$, $\{G_{n}\}$ respectively, where
each sequence is an element of $l_{2}$. Find what relations hold between the Fourier
coefficients, if the equation is satisfied. Suppose that $\{f_{n}\}$, $\{g_{n}\}$ and $\{G_{n}\}$ are given
subject to the conditions

\[ |G_{n}| \le k\,|g_{n}|, \quad k \text{ fixed},\ 0 < k < 1, \qquad |f_{n} - g_{n}| \le 1, \quad \forall n. \]

Show that $\{F_{n}\}$ is uniquely determined in $l_{2}$ and $|F_{n}| \le [k/(1 - k)]\,|f_{n}|$, $\forall n$.


14. Let $Q$ be a 3 by 3 matrix with $Q^{2} \neq 0$, $Q^{3} = 0$ and consider all matrices of the form
$\alpha Q + \beta Q^{2}$, where $\alpha$ and $\beta$ are complex numbers. These matrices form a normed
subalgebra $\mathfrak{B}$ of $\mathfrak{M}_{3}$ without unit element. Equation (9.2.32) is to be studied in $\mathfrak{B}$.
Let $(\alpha, \beta) \to f(\alpha, \beta)$ be an arbitrary mapping of $C \times C$ into $C$ and set
$E(L) = f(\alpha, \beta)\, L^{2}$ for all $L \in \mathfrak{B}$. Show that this is a solution of (9.2.32) which may
be wildly discontinuous. If it is locally bounded, show that $L\,E(L)$ will be continuous.
If $E(L)$ is continuous, show that $L\,E(L)$ is (F)-analytic even though $E(L)$ is not.


9.3 GELFAND’S REPRESENTATION THEOREM 


In 1941 I. M. Gelfand presented an elegant method of representing a commutative 
B-algebra with unit element by means of residue class algebras modulo a maximal 
ideal. Such algebras have important properties. In the following, algebras will be 
commutative, with unit element, until further notice. 

Any B-algebra contains linear subspaces known as ideals. A set $\mathfrak{i}$ is called an
ideal of $\mathfrak{B}$ if $x, y \in \mathfrak{i}$ imply (1) $x + y \in \mathfrak{i}$ and (2) $x\mathfrak{B} \subset \mathfrak{i}$. The ideal may be trivial: the
null ideal containing only the zero element and the unit ideal which is all of $\mathfrak{B}$. An
algebra is said to be simple if these two are the only ideals admitted. An example of
a simple algebra is furnished by $C$, the field of complex numbers. If $\mathfrak{B}$ is not
simple, then there are other ideals called proper. Thus if $\mathfrak{B}$ contains a singular
element $a \neq 0$, then the set $a\mathfrak{B}$ is a proper ideal. It is an ideal for if $x = ab$, $y = ac$
are in the set, so is $x + y$ and $x\mathfrak{B} \subset a\mathfrak{B}$. The ideal is proper for it contains $a \neq 0$,
so it cannot be the null ideal, and if it were all of $\mathfrak{B}$, then there would exist a $b$ such
that $ab = ba = e$. This is absurd since $a$ is singular by assumption.

An ideal $\mathfrak{m}$ is said to be maximal if $\mathfrak{m}$ is a proper ideal and properly contained
in no other proper ideal.


Lemma 9.3.1. An element $x$ of $\mathfrak{B}$ is regular iff it belongs to no maximal ideal.


Proof. Suppose that $x$ is regular and belongs to a maximal ideal. Then $x x^{-1} = e$
belongs to the ideal and the latter must be the unit ideal. This is a contradiction.
Hence a regular element can belong to no maximal ideal or for that matter to any
proper ideal. Conversely, if $x$ belongs to no maximal ideal, then the assumption that $x$ be
singular would lead to a contradiction. For then $x\mathfrak{B}$ would be an ideal containing $x$,
and this ideal is either maximal or a proper subset of a maximal ideal. Again we
have a contradiction. ∎


Corollary. If $\mathfrak{B}$ is simple, then all elements are regular excepting the zero
element.


For there are no proper ideals, hence no singular elements besides zero. 
This result may be strengthened, but we shall first insert a remark concerning 
the topological structure of a maximal ideal. 


Lemma 9.3.2. A maximal ideal in a B-algebra is a closed point set. 


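Proof. For $\mathfrak{m} \subset \overline{\mathfrak{m}}$ and $\overline{\mathfrak{m}}$ is also an ideal (why?). There are two possibilities:
$\overline{\mathfrak{m}} = \mathfrak{m}$ or $\overline{\mathfrak{m}} = \mathfrak{B}$. The second possibility must be rejected for it would imply that $\mathfrak{m}$
is dense in $\mathfrak{B}$, in particular, in the sphere $\|x - e\| < 1$, all the elements of which are
regular. As we have seen, no element of a maximal ideal can be regular. Hence
$\overline{\mathfrak{m}} = \mathfrak{m}$ or $\mathfrak{m}$ is closed. ∎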


We can strengthen the corollary under Lemma 9.3.1 as follows. 


Theorem 9.3.1. A simple commutative B-algebra with unit element over the 
complex field is isomorphic to the complex field. 


Proof. If $x \in \mathfrak{B}$, then $\sigma(x)$ cannot be void and there is at least one value $\lambda_{0}$ for
which $\lambda_{0} e - x$ is singular. For the assumption that $\sigma(x)$ is void would lead to the
conclusion that the resolvent is $\mathfrak{B}$-holomorphic in the extended plane. Only a
constant can have this property and, since $R(\lambda, x)$ is 0 at infinity, $R(\lambda, x)$ would be
identically zero and this is absurd. Hence there is a $\lambda_{0}$ such that $\lambda_{0} e - x$ is singular.
Since the zero element is the only singular element of a simple algebra, we have
$\lambda_{0} e - x = 0$ or $x = \lambda_{0} e$. Note that the spectrum of $x$ contains one and only one
point. Thus there is a 1-1 correspondence between $\mathfrak{B}$ and the complex field $C$ and
this correspondence is an isomorphism (why?). ∎


We recall that an isomorphism takes sums into sums and products into
products.

The existence of proper ideals in non-simple commutative algebras $\mathfrak{B}$ was
shown above: for any singular element $x \in \mathfrak{B}$, $x \neq 0$, the set $x\mathfrak{B}$ is a proper ideal. The existence
of maximal ideals is another matter. Here Zorn's Lemma is commonly invoked.
If a given ideal $\mathfrak{i}$ is not maximal, then it is properly contained in a larger ideal $\mathfrak{i}_{1}$.
In this manner a partial ordering by inclusion is defined for the ideals of $\mathfrak{B}$ and
Zorn's Lemma establishes the existence of a maximal element of such a partially
ordered collection which is a maximal ideal containing $\mathfrak{i}$.


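Example 1. We shall give an instance where the ideal under discussion turns out
to be maximal. We take the commutative B-algebra $C[0, 1]$ of functions $f$ continuous
in $[0, 1]$. Let $t_{0}$ be a point of this interval, $0 \le t_{0} \le 1$, and set

\[ \mathfrak{N} = \{f;\ f(t_{0}) = 0\}. \tag{9.3.1} \]

This is an ideal since $f, g \in \mathfrak{N}$, $h \in \mathfrak{B}$ implies

\[ f + g \in \mathfrak{N}, \qquad fh \in \mathfrak{N}. \]

Moreover, $\mathfrak{N}$ is a maximal ideal. For if $\mathfrak{N}$ were a proper subset of a proper ideal $\mathfrak{i}$ and
if $g \in \mathfrak{i}$ but $g \notin \mathfrak{N}$, then $g(t_{0}) = \lambda_{0}$, some number $\neq 0$. Let $h(t) = \lambda_{0} - g(t)$, then
$h \in \mathfrak{N}$ and

\[ \lambda_{0} = [\lambda_{0} - g] + g \in \mathfrak{i}. \]

This says that some constant multiple of the unit element is in $\mathfrak{i}$ and this implies
$\mathfrak{i} = \mathfrak{B}$, contrary to the assumption that $\mathfrak{i}$ is proper, so that $\mathfrak{N}$ is maximal.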

Now suppose that $\mathfrak{B}$ is a commutative B-algebra with unit element $e$, not
simple, and let $\mathfrak{m}$ be a maximal ideal of $\mathfrak{B}$. The elements of $\mathfrak{B}$ fall into equivalence
classes with respect to $\mathfrak{m}$. Here $x_{1}$ and $x_{2}$ belong to the same equivalence class iff

\[ x_{1} - x_{2} \in \mathfrak{m}. \]

In this connection the equivalence classes are called residue classes of $\mathfrak{B}$ modulo $\mathfrak{m}$.
Each class $X$ is of the form

\[ X = x + \mathfrak{m}, \tag{9.3.2} \]

i.e. $X$ is the set obtained by adding the fixed element $x$ to the elements of $\mathfrak{m}$. The set
of all residue classes $X$ will form the so-called quotient algebra $\mathfrak{B}/\mathfrak{m}$. This set is an
algebra, actually a B-algebra, under suitable definitions of the algebraic operations
and the norm. If

\[ X = x + \mathfrak{m}, \qquad Y = y + \mathfrak{m}, \qquad \alpha \in C, \]

we set

\[ X + Y = x + y + \mathfrak{m}, \tag{9.3.3} \]
\[ \alpha X = \alpha x + \mathfrak{m}, \tag{9.3.4} \]
\[ XY = xy + \mathfrak{m}, \tag{9.3.5} \]

and define

\[ \|X\| = \inf \{\|s\|;\ s \in X\}. \tag{9.3.6} \]

The reader should verify that these definitions satisfy the A-, S-, and M-postulates
of Sections 2.1 and 2.6, that (9.3.6) defines a norm satisfying

\[ \|XY\| \le \|X\|\,\|Y\| \tag{9.3.7} \]

and that, moreover, $\mathfrak{B}/\mathfrak{m}$ is complete in this metric. We have now the important


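Theorem 9.3.2. If $\mathfrak{m}$ is a maximal ideal in $\mathfrak{B}$, then the quotient algebra $\mathfrak{B}/\mathfrak{m}$ is
isomorphic to the algebra of complex numbers.


Proof. It is to be shown that $\mathfrak{B}/\mathfrak{m}$ has no proper ideals. Suppose, contrariwise,
that $\mathfrak{I}$ is a proper ideal in $\mathfrak{B}/\mathfrak{m}$, $\mathfrak{I} \neq \{0\}$. Then $\mathfrak{I}$ contains residue classes
$X = x + \mathfrak{m}$ and not merely the zero class $0 + \mathfrak{m}$. Let $\mathfrak{M}$ be the union of all the
elements of $\mathfrak{B}$ which belong to the residue classes that together make up the ideal $\mathfrak{I}$.
It is claimed that $\mathfrak{M}$ is an ideal of $\mathfrak{B}$ which properly contains $\mathfrak{m}$, which implies that $\mathfrak{M} = \mathfrak{B}$,
and this in turn gives $\mathfrak{I} = \mathfrak{B}/\mathfrak{m}$ so that $\mathfrak{I}$ is not proper. To see all this, suppose that
$x \in X$ and $y \in Y$ where $X$ and $Y$ belong to $\mathfrak{I}$. Then $x$ and $y$ belong to $\mathfrak{M}$ and
$x + y \in X + Y$ implies $x + y \in \mathfrak{M}$ since $X + Y \in \mathfrak{I}$. Further, by assumption,
$X(\mathfrak{B}/\mathfrak{m}) \subset \mathfrak{I}$ and this says that $x \in \mathfrak{M}$ implies that $x\mathfrak{B} \subset \mathfrak{M}$. Now the set $\mathfrak{m}$ is the
zero element of $\mathfrak{B}/\mathfrak{m}$ and hence necessarily an element of $\mathfrak{I}$ and this gives $\mathfrak{m} \subset \mathfrak{M}$.
But $\mathfrak{m}$ is maximal, so $\mathfrak{M} = \mathfrak{B}$ and $\mathfrak{I} = \mathfrak{B}/\mathfrak{m}$. Hence $\mathfrak{B}/\mathfrak{m}$ is simple and by Theorem
9.3.1 this implies that $\mathfrak{B}/\mathfrak{m}$ is isomorphic to $C$. ∎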


In this manner every maximal ideal $\mathfrak{m}$ of $\mathfrak{B}$ defines a correspondence between $\mathfrak{B}$
and $C$ in which the element $x$ is mapped on the complex number $\alpha$. Here $\alpha$ is the
number such that $x$ belongs to the residue class $\alpha e + \mathfrak{m}$, which is $\alpha$ times $e + \mathfrak{m}$.
Inspection of (9.3.5) shows that the residue class $e + \mathfrak{m} = E$ is the unit element of
the algebra $\mathfrak{B}/\mathfrak{m}$. Hence $x$ belongs to the residue class $\alpha E$ modulo $\mathfrak{m}$ and the
mapping is $x \to \alpha$. We now define a functional $x(\mathfrak{m})$ on the space $\mathfrak{M}$ of all maximal
ideals $\mathfrak{m}$ of $\mathfrak{B}$ by the convention that $x(\mathfrak{m})$ is the complex number uniquely
determined by the condition

\[ x - x(\mathfrak{m})\,e \in \mathfrak{m}. \tag{9.3.8} \]


This functional now has the following properties: 


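Theorem 9.3.3. For each $\mathfrak{m}$
i) $(x + y)(\mathfrak{m}) = x(\mathfrak{m}) + y(\mathfrak{m})$, $\forall x \in \mathfrak{B}$, $\forall y \in \mathfrak{B}$;
ii) $(\alpha x)(\mathfrak{m}) = \alpha\, x(\mathfrak{m})$, $\forall \alpha \in C$, $\forall x \in \mathfrak{B}$;
iii) $(xy)(\mathfrak{m}) = x(\mathfrak{m})\, y(\mathfrak{m})$, $\forall x \in \mathfrak{B}$, $\forall y \in \mathfrak{B}$;
iv) $e(\mathfrak{m}) = 1$;
v) $0(\mathfrak{m}) = 0$;
vi) If $x$ is regular, then $x(\mathfrak{m}) \neq 0$ and $(x^{-1})(\mathfrak{m}) = [x(\mathfrak{m})]^{-1}$;
vii) $x(\mathfrak{m}) \in \sigma(x)$;
viii) $r(x) = \sup_{\mathfrak{m}} |x(\mathfrak{m})|$;
ix) If $\lambda_{0} \in \sigma(x)$, then there exists a maximal ideal $\mathfrak{m}_{0}$ such that $x(\mathfrak{m}_{0}) = \lambda_{0}$;
x) If $x(\mathfrak{m}) = 0$ for all $\mathfrak{m}$, then $x$ is nilpotent or quasinilpotent.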


Proof. The first three properties are simply a restatement of the isomorphic
properties of the mapping from $x$ to $x(\mathfrak{m})$ implied by Theorem 9.3.2. Properties
(iv) and (v) are implied by $e$ and 0 being idempotents. For if

\[ j^{2} = j, \quad\text{then}\quad [j(\mathfrak{m})]^{2} = j(\mathfrak{m}), \]

so that

\[ j(\mathfrak{m}) = 0 \ \text{or}\ 1. \tag{9.3.9} \]

Since $0 \cdot x = 0$ and $ex = x$, the choice is categorical: $0(\mathfrak{m}) = 0$, $e(\mathfrak{m}) = 1$.

If $x$ is regular, then $x x^{-1} = e$ gives $x(\mathfrak{m})\,(x^{-1})(\mathfrak{m}) = 1$ and (vi) follows. For
(vii) we note that $x(\mathfrak{m}) = \alpha$ for a given $\mathfrak{m}$ implies that $x \equiv \alpha e \pmod{\mathfrak{m}}$ or
$x - \alpha e \in \mathfrak{m}$ or $x - \alpha e$ is singular. Hence $\alpha \in \sigma(x)$. Since $r(x)$ is the maximum of the
absolute value of any element of $\sigma(x)$, it is seen that (vii) implies $|x(\mathfrak{m})| \le r(x)$. The
stronger assertion made in (viii) follows from (ix), which we now proceed to prove.
Suppose that $\lambda_{0} \in \sigma(x)$ so that $\lambda_{0} e - x$ is singular. We can then form the ideal
$(\lambda_{0} e - x)\mathfrak{B}$ which contains $\lambda_{0} e - x$. If this ideal is maximal, we are through; if
not, then we can embed it in a maximal ideal $\mathfrak{m}_{0}$ and $x \equiv \lambda_{0} e \pmod{\mathfrak{m}_{0}}$ and
$x(\mathfrak{m}_{0}) = \lambda_{0}$. This implies (ix) and hence also (viii). Finally, if $x(\mathfrak{m}) = 0$ for all $\mathfrak{m}$,
then by (ix) $\sigma(x) = \{0\}$ and the resolvent

\[ R(\lambda, x) = \sum_{n=0}^{\infty} x^{n} \lambda^{-n-1} \]

converges for all $\lambda \neq 0$. If the series breaks off, then $x$ is a nilpotent, otherwise the
convergency condition requires that $x$ is a quasinilpotent. ∎
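A finite-dimensional illustration of Theorem 9.3.3 may be helpful. It is an example chosen for this sketch, not one from the text: the $n$ by $n$ circulant matrices form a commutative algebra with unit element, its multiplicative functionals are the $n$ coordinates of the discrete Fourier transform of the first column, and these play the role of the values $x(\mathfrak{m})$. The helper names `circulant` and `gelfand_transform` are hypothetical.

```python
import numpy as np

def circulant(c):
    """n-by-n circulant matrix whose first column is c: C[i, k] = c[(i - k) mod n]."""
    n = len(c)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return np.asarray(c, dtype=complex)[idx]

def gelfand_transform(x):
    """Values x(m_k), k = 0..n-1: the discrete Fourier transform of the first column."""
    return np.fft.fft(x[:, 0])

rng = np.random.default_rng(0)
x = circulant(rng.standard_normal(5))
y = circulant(rng.standard_normal(5))

x_hat, y_hat = gelfand_transform(x), gelfand_transform(y)
xy_hat = gelfand_transform(x @ y)              # property (iii): equals x_hat * y_hat
e_hat = gelfand_transform(np.eye(5))           # property (iv): identically 1
spectrum = np.linalg.eigvals(x)                # properties (vii), (ix): same set as x_hat
spectral_radius = np.max(np.abs(spectrum))     # property (viii): equals max |x_hat|
```

In this algebra zero is the only element with $x(\mathfrak{m}) = 0$ for all $\mathfrak{m}$, so property (x) is not visible here; it needs an algebra with non-trivial quasinilpotents.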


We return to Example 1 for illustration and inspiration. If $t_{0}$ is given,
$0 \le t_{0} \le 1$, and if

\[ \mathfrak{m} = \{f;\ f(t_{0}) = 0\}, \]

the residue classes modulo $\mathfrak{m}$ are now simply

\[ \alpha E = \{f;\ f(t_{0}) = \alpha\}. \]

Hence we have $x(\mathfrak{m}) = f(\mathfrak{m}) = f(t_{0})$ and the reader will have no difficulty in
verifying that the mapping $f \to f(t_{0})$ has all the properties (i) to (x). Incidentally,
the zero element of $C[0, 1]$ is the only nilpotent and there are no proper
quasinilpotents.

In this case the maximal ideals and the residue classes are determined by the
value of the element $f$ at a preassigned point $t = t_{0}$. Now the map from $f$ to $f(t_{0})$
determines a linear, multiplicative, and bounded functional on $C[0, 1]$ and this
raises the question of the relations between such functionals, on the one hand, and
maximal ideals, on the other. Here we note


Theorem 9.3.4. Let $\mathfrak{B}$ be a complex commutative non-simple B-algebra with
unit element and let $\mu(x)$ be a linear, multiplicative bounded functional defined
on $\mathfrak{B}$. Let $\mathfrak{N}$ be the nullspace of $\mu$. Then $\mathfrak{N}$ is a maximal ideal. Here the set of
all $x \in \mathfrak{B}$ for which $\mu(x)$ has a fixed value $\alpha$ is the residue class $\alpha E$ modulo $\mathfrak{N}$.
Conversely, the elements of a maximal ideal in $\mathfrak{B}$ form the null space $\mathfrak{N}$ of a
linear multiplicative and bounded functional on $\mathfrak{B}$ which takes the value $\alpha$ on
the residue class $\alpha E$ modulo $\mathfrak{N}$.

Proof. Given a non-trivial linear multiplicative bounded functional $\mu$ on $\mathfrak{B}$, its
nullspace $\mathfrak{N}$ is certainly a proper ideal since

\[ \mu(x + y) = \mu(x) + \mu(y), \quad \mu(\alpha x) = \alpha\,\mu(x), \quad \mu(xy) = \mu(x)\,\mu(y), \quad \mu(e) = 1. \tag{9.3.10} \]

If $\mathfrak{N}$ is not maximal, then there exists a maximal ideal $\mathfrak{m}$ containing $\mathfrak{N}$. Suppose
that $x_{0} \in \mathfrak{m}$ but is not in $\mathfrak{N}$, so that $\mu(x_{0}) = \lambda_{0} \neq 0$. Then $\lambda_{0} e - x_{0} \in \mathfrak{N}$ and
a fortiori belongs to $\mathfrak{m}$. Hence

\[ \lambda_{0} e = (\lambda_{0} e - x_{0}) + x_{0} \in \mathfrak{m} \]

or $\mathfrak{m}$ contains regular elements. This is impossible, so $\mathfrak{N}$ must be maximal and the
residue class $\alpha E \pmod{\mathfrak{N}}$ is precisely the set of elements $x$ with $\mu(x) = \alpha$.
Conversely, if $\mathfrak{m}$ is a maximal ideal in $\mathfrak{B}$, then the mapping $\mathfrak{B}/\mathfrak{m} \to C$ is an
isomorphism and hence determines a linear, multiplicative and bounded functional

\[ \mu(x) = x(\mathfrak{m}) \tag{9.3.11} \]

by Theorem 9.3.3. ∎


An important application of Gelfand's theorem is to spectral analysis.
Properties (vii) and (ix) of Theorem 9.3.3 state that the value of $x(\mathfrak{m}) = \mu(x)$
belongs to the spectrum $\sigma(x)$ of $x$ and, moreover, that all the spectral values of $x$
can be reached by varying the maximal ideal $\mathfrak{m}$. There is one obnoxious restriction,
however: the underlying algebra must be commutative. Gelfand has shown how to
get around this difficulty. We embed the element or elements under consideration
in a commutative subalgebra of $\mathfrak{B}$ which is large enough so that spectra are
unchanged. This means that the subalgebra must be closed under inversion so that
if a regular element belongs to the subalgebra, so does its inverse.

Let $x_{1}, x_{2}, \ldots, x_{n}$ be elements of $\mathfrak{B}$ which commute and form the subalgebra $\mathfrak{A}$
generated by the elements

\[ e,\ x_{1},\ x_{2},\ \ldots,\ x_{n}. \]

Then $\mathfrak{A}$ consists of all polynomials in these elements. $\mathfrak{A}$ is obviously Abelian
(= commutative). Next we form $\mathfrak{A}^{c}$, the first commutant of $\mathfrak{A}$. This is the set of
all elements of $\mathfrak{B}$ which commute with $x_{1}, x_{2}, \ldots, x_{n}$ and hence with all elements of
$\mathfrak{A}$. We have, of course, $\mathfrak{A} \subset \mathfrak{A}^{c} \subset \mathfrak{B}$. It is easy to see that $\mathfrak{A}^{c}$ is also an algebra
with unit element $e$, but $\mathfrak{A}^{c}$ need not be commutative. To get out of this difficulty
we repeat the procedure and form

\[ (\mathfrak{A}^{c})^{c} = \mathfrak{A}^{cc}, \tag{9.3.12} \]

known as the second commutant of $\mathfrak{A}$. Let us note that if $\mathfrak{A}_{1}$ and $\mathfrak{A}_{2}$ are
subalgebras of $\mathfrak{B}$ and if $\mathfrak{A}_{1} \subset \mathfrak{A}_{2}$, then $\mathfrak{A}_{2}^{c} \subset \mathfrak{A}_{1}^{c}$, for the elements of $\mathfrak{B}$ which
commute with those of $\mathfrak{A}_{2}$ must commute with those of $\mathfrak{A}_{1}$ but the converse does
not necessarily hold. From $\mathfrak{A} \subset \mathfrak{A}^{c}$ we thus conclude that $\mathfrak{A}^{cc} \subset \mathfrak{A}^{c}$. Again, $\mathfrak{A}^{cc}$
is an algebra with unit element. It is closed in the normed metric and so is $\mathfrak{A}^{c}$.
Now $\mathfrak{A}^{cc}$ has the added property of being Abelian. This is seen as follows. If $x$ and
$y$ belong to $\mathfrak{A}^{cc}$ and if $z$ is any element of $\mathfrak{A}^{c}$, then $xz = zx$ by the definition of $\mathfrak{A}^{cc}$.
Since $\mathfrak{A}^{cc} \subset \mathfrak{A}^{c}$ we can take $z = y$ and see that $x$ and $y$ commute, so that $\mathfrak{A}^{cc}$ is
commutative. Furthermore, if $x$ is a regular element of $\mathfrak{B}$ and belongs to $\mathfrak{A}^{cc}$ and if
$z$ is any element of $\mathfrak{A}^{c}$, then

\[ xz = zx \quad\text{implies}\quad z x^{-1} = x^{-1} z, \]

so that $x^{-1} \in \mathfrak{A}^{cc}$. Thus $\mathfrak{A}^{cc}$ is closed under inversion of regular elements. It
follows that if $x \in \mathfrak{A}^{cc}$ and $\lambda e - x$ has no inverse in $\mathfrak{B}$, then it can have no inverse in
$\mathfrak{A}^{cc}$, and vice versa. Thus the spectral properties of $x$ are the same with respect to
$\mathfrak{B}$ as with respect to the commutative subalgebra $\mathfrak{A}^{cc}$. Gelfand's representation
theorem applies to the latter and thus gives the required information about spectra
for any $x \in \mathfrak{A}$.
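The two commutants can be computed explicitly in a matrix algebra. The sketch below is an illustration only, assuming $\mathfrak{B}$ is the algebra of 3 by 3 real matrices and $\mathfrak{A}$ is generated by $e$ and the single element $x = \operatorname{diag}(1, 1, 2)$; the helper `commutant_basis` is hypothetical and finds the commutant as the null space of the linear map $z \to mz - zm$.

```python
import numpy as np

def commutant_basis(mats, n, tol=1e-10):
    """Basis (list of n-by-n arrays) of {z : m z = z m for every m in mats}."""
    eye = np.eye(n)
    # Row-major vec: vec(m z) = kron(m, I) vec(z), vec(z m) = kron(I, m.T) vec(z).
    system = np.vstack([np.kron(m, eye) - np.kron(eye, m.T) for m in mats])
    _, s, vh = np.linalg.svd(system)
    rank = int(np.sum(s > tol))
    return [v.reshape(n, n) for v in vh[rank:]]   # rows of vh spanning the null space

x = np.diag([1.0, 1.0, 2.0])            # generator of the subalgebra A
first = commutant_basis([x], 3)         # basis of A^c: block matrices, not commutative
second = commutant_basis(first, 3)      # basis of A^cc: commutative, contains x
x_inv = np.linalg.inv(x)                # commutes with A^c, hence lies in A^cc
```

Here $\mathfrak{A}^{c}$ has dimension 5 and contains non-commuting elements, while $\mathfrak{A}^{cc}$ has dimension 2 and is Abelian, as the argument above predicts.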


EXERCISE 9.3 


1. Verify the postulates (A, S, M) for the residue classes.

2. Verify that (9.3.6) defines a norm and that (9.3.7) holds.

3. Verify that for an idempotent $j \neq 0, e$ both 0 and 1 belong to the spectrum.

4. If $\mathfrak{A}$ is the subalgebra generated by $e$ and $j$, characterize the commutants.

5. Let $P(\lambda)$ be a scalar polynomial with complex coefficients. Let
\[ P(a) = \alpha_{0} e + \alpha_{1} a + \alpha_{2} a^{2} + \cdots + \alpha_{n} a^{n} \]
be the result of replacing $\lambda$ by $a$, an element of a non-commutative B-algebra with
unit element $e$. Show that the spectrum of $P(a)$ is $\{P(\lambda);\ \lambda \in \sigma(a)\}$. [Special case of
the spectral mapping theorem.]

6. In a B-algebra $\mathfrak{B}$ there are two operations $L_{a}$ and $R_{a}$ which are particularly important
for purposes of representation. Here $L_{a} x = ax$ and $R_{a} x = xa$. Show that the
spectra of $L_{a}$ and $R_{a}$ in the operator algebra $\mathfrak{E}(\mathfrak{B})$ of linear bounded transformations
from $\mathfrak{B}$ into itself are identical with the spectrum of $a$ as an element of $\mathfrak{B}$.

7. The Jordan operator $J_{a} = \tfrac{1}{2}[L_{a} + R_{a}]$ with $J_{a} x = \tfrac{1}{2}[ax + xa]$ has a spectrum
relative to $\mathfrak{E}(\mathfrak{B})$. Show that $\sigma(J_{a})$ is contained in the set of mean values
\[ \tfrac{1}{2}[\sigma(a) + \sigma(a)] = \{\tfrac{1}{2}(\alpha + \beta);\ \alpha, \beta \in \sigma(a)\}. \]

8. The commutator $C_{a} = L_{a} - R_{a}$, i.e. $C_{a} x = ax - xa$, has its spectrum contained
in the difference set $\sigma(a) - \sigma(a)$, i.e. $\sigma(C_{a}) \subset \{\alpha - \beta;\ \alpha, \beta \in \sigma(a)\}$.

9. Let $a$ and $b$ be two elements of $\mathfrak{B}$, distinct or not, and consider the operator
$S_{a,b} = L_{a} R_{b}$ which maps $x$ onto $axb$. Show that the spectrum of $S_{a,b}$ is contained
in the product set of $\sigma(a)$ and $\sigma(b)$ so that $\sigma[S_{a,b}] \subset \{\alpha\beta;\ \alpha \in \sigma(a),\ \beta \in \sigma(b)\}$.


9.4 (F)-ANALYTIC FUNCTIONS AND THE OPERATIONAL CALCULUS 


Let $\mathfrak{B}$ be a B-algebra, not necessarily commutative, with unit element $e$. By this
time we have a fairly clear idea of what is meant by a polynomial in $x$, by $\exp x$ or
by $\sin x$, etc., where $x$ is any element of $\mathfrak{B}$. We simply take the corresponding scalar
power series

\[ f(\lambda) = \alpha_{0} + \alpha_{1} \lambda + \cdots + \alpha_{n} \lambda^{n} + \cdots \tag{9.4.1} \]

and replace $\lambda$ by $x$, writing $\alpha_{0} e$ for $\alpha_{0}$. Thus

\[ f(x) = \alpha_{0} e + \alpha_{1} x + \cdots + \alpha_{n} x^{n} + \cdots. \tag{9.4.2} \]

If $f$ is an entire function, no restriction need be put on $x$, but if (9.4.1) has a finite
radius of convergence $R$, we require a limitation on the spectral radius of $x$:

\[ r(x) < R. \tag{9.4.3} \]

It is not difficult to show that (9.4.2) defines an (F)-analytic function; this will be
proved in a more general context below.
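Condition (9.4.3) restricts the spectral radius, not the norm. A small numerical sketch, with a 2 by 2 matrix chosen here for illustration and a hypothetical helper `power_series`, shows the geometric series $f(\lambda) = (1 - \lambda)^{-1}$, $R = 1$, converging at an element whose norm is about 10 but whose spectral radius is $\tfrac{1}{2}$.

```python
import numpy as np

def power_series(coeff, x, terms):
    """Partial sum alpha_0 e + alpha_1 x + ... of (9.4.2); coeff(n) returns alpha_n."""
    total = np.zeros_like(x, dtype=float)
    power = np.eye(x.shape[0])           # x**0 = e
    for n in range(terms):
        total += coeff(n) * power
        power = power @ x
    return total

x = np.array([[0.5, 10.0], [0.0, 0.5]])
spectral_radius = np.max(np.abs(np.linalg.eigvals(x)))   # r(x) = 0.5 < R = 1
operator_norm = np.linalg.norm(x, 2)                     # roughly 10, larger than R

geometric = power_series(lambda n: 1.0, x, terms=200)    # all alpha_n = 1
exact = np.linalg.inv(np.eye(2) - x)                     # (e - x)^{-1}
```

The partial sums grow at first, because the powers of $x$ have large norm for small $n$, and then settle down to $(e - x)^{-1}$.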

We can look at the problem of defining analytic mappings from $\mathfrak{B}$ to $\mathfrak{B}$ as an
extension problem. The complex plane can be embedded in $\mathfrak{B}$ since $\mathfrak{B}$ has a unit
element and all complex multiples of $e$ belong to $\mathfrak{B}$. Suppose now that on some
domain $\Delta$ in the intersection of $\mathfrak{B}$ with the complex plane we have defined a function
$f(\lambda)e$ where $f$ is Cauchy analytic for $\lambda \in \Delta$. This function is to be extended to
adjacent parts of $\mathfrak{B}$ in such a manner that some type of analyticity is preserved. The
passage from (9.4.1) to (9.4.2) illustrates such an extension preserving Fréchet
analyticity. This can be generalized. We start with a local point of view.

Suppose that $\lambda_{0} \in \Delta$ and that

\[ f(\lambda) = \sum_{n=0}^{\infty} \frac{f^{(n)}(\lambda_{0})}{n!}\, (\lambda - \lambda_{0})^{n} \tag{9.4.4} \]

for $|\lambda - \lambda_{0}| < \rho$. We then set

\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(\lambda_{0})}{n!}\, (x - \lambda_{0} e)^{n} \tag{9.4.5} \]

for $x$ in the set

\[ S:\ \{x;\ r(x - \lambda_{0} e) < \rho\}. \tag{9.4.6} \]

An equivalent condition is that $\sigma(x)$ should be located in the interior of the circle of
convergence of (9.4.4). We refer to $f(x)$ as the principal extension of $f(\lambda)$ from the
disk $|\lambda - \lambda_{0}| < \rho$ to the set $S$. We shall sum the series (9.4.4) into a form which
suggests further extensions.

Let $0 < \rho_{1} < \rho_{2} < \rho$. Then

\[ \frac{1}{n!}\, f^{(n)}(\lambda_{0}) = \frac{1}{2\pi i} \int_{\Gamma} \frac{f(\mu)\, d\mu}{(\mu - \lambda_{0})^{n+1}}, \tag{9.4.7} \]

where $\Gamma$ is the oriented circle $|\mu - \lambda_{0}| = \rho_{2}$ so that for $r(x - \lambda_{0} e) < \rho_{1}$,

\[ f(x) = \frac{1}{2\pi i} \int_{\Gamma} f(\mu) \sum_{n=0}^{\infty} \frac{(x - \lambda_{0} e)^{n}}{(\mu - \lambda_{0})^{n+1}}\, d\mu. \tag{9.4.8} \]

The series converges in norm for such $\mu$ and $x$ to the sum $R(\mu, x)$ so that

\[ f(x) = \frac{1}{2\pi i} \int_{\Gamma} f(\mu)\, R(\mu, x)\, d\mu. \tag{9.4.9} \]

By Cauchy's integral

\[ f(\lambda) = \frac{1}{2\pi i} \int_{\Gamma} f(\mu)\, (\mu - \lambda)^{-1}\, d\mu. \tag{9.4.10} \]

Thus the extension of $f(\lambda)$ from the disk $|\lambda - \lambda_{0}| < \rho$ to the set $S$ is obtained by
replacing the Cauchy kernel by the resolvent $R(\mu, x)$. The latter is, of course, the
principal extension of the Cauchy kernel to the B-algebra. In this discussion the
restrictions imposed on $\Gamma$ and on $x$ may be substantially modified. This leads to


Definition 9.4.1. Suppose that $f(\lambda)$ is Cauchy holomorphic in a simply connected
domain $\Delta$ and is continuous inside and on its oriented simple rectifiable boundary
$\Gamma$. Suppose that $x \in \mathfrak{B}$ is such that $\sigma(x) \subset \Delta$. For such $f$ and such values of $x$ we
define the principal extension of $f(\lambda)$ to $\mathfrak{B}$ by formula (9.4.9) where now $\Gamma$ is the
boundary of $\Delta$.
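Formula (9.4.9) lends itself to a direct numerical check. The following is a minimal sketch under assumptions made here, not a method from the text: $\mathfrak{B}$ is taken to be the 2 by 2 complex matrices, $\Gamma$ is a circle discretized by the trapezoidal rule, and the result for $f = \exp$ is compared with the power series (9.4.2). The names `principal_extension` and `exp_series` are illustrative.

```python
import numpy as np

def principal_extension(f, x, center, radius, nodes=128):
    """Approximate (1/(2 pi i)) * integral of f(mu) R(mu, x) d mu over |mu - center| = radius."""
    e = np.eye(x.shape[0], dtype=complex)
    total = np.zeros_like(e)
    for theta in 2.0 * np.pi * np.arange(nodes) / nodes:
        step = radius * np.exp(1j * theta)          # mu - center; d mu = i * step * d theta
        mu = center + step
        total += f(mu) * np.linalg.inv(mu * e - x) * step
    return total / nodes

def exp_series(x, terms=60):
    """Reference value of exp x from the power series (9.4.2)."""
    total = np.zeros_like(x, dtype=complex)
    term = np.eye(x.shape[0], dtype=complex)
    for n in range(terms):
        total += term
        term = term @ x / (n + 1)
    return total

x = np.array([[1.0, 2.0], [0.0, 3.0]])              # sigma(x) = {1, 3}, inside the circle
by_contour = principal_extension(np.exp, x, center=2.0, radius=3.0)
by_series = exp_series(x)
```

The circle only has to enclose $\sigma(x)$ and stay inside the region where $f$ is holomorphic; the centre and radius used here are one convenient choice.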


Actually we can relax the assumptions. By using more general forms of
Cauchy's integral we can dispense with the assumption that $\Delta$ is simply connected.
In some applications it is convenient to allow open sets $\Delta$ which are not connected
but have a finite number of connected components in each of which $f(\lambda)$ is supposed
to be holomorphic, though $f$ is only piecewise analytic in $\Delta$. In this context the
boundary $\Gamma$ consists of a finite number of simple rectifiable curves suitably oriented.
These refinements are just mentioned for the reader's information; we shall find it
convenient to stick to Definition 9.4.1 in the following.

The definition establishes a 1-1 correspondence between two classes of
functions, $F$ and $H$. Here $F$ is the set of functions $f$ given by Cauchy's integral
(9.4.10) with $\Gamma$ taken as the boundary of the domain $\Delta$. They are mappings from $C$
to $C$, Cauchy holomorphic in $\Delta$ and continuous in $\Delta \cup \Gamma$. $H$, on the other hand,
consists of mappings from $\mathfrak{B}$ to $\mathfrak{B}$ defined by (9.4.9) for values of $x$ such that
$\sigma(x) \subset \Delta$. This correspondence and these mappings have a number of interesting
properties, some of which will be discussed in the following.

The function $f(\lambda)$ is Cauchy holomorphic in $\Delta$; its image $f(x)$ is locally
(F)-analytic in the set

\[ S_{\Delta}:\ \{x;\ \sigma(x) \subset \Delta\}. \]

Now Cauchy's integral (9.4.10) defines a function holomorphic in $\Delta$, and for this to
be true it is sufficient that $f$ is given as a continuous function of arc length on $\Gamma$.
No assumptions are needed in the interior of $\Delta$. The analyticity of the integral with
respect to $\lambda$ resides in the analyticity of the Cauchy kernel $(\mu - \lambda)^{-1}$ with respect
to $\lambda$. In the same manner, the analytical properties of integrals of the form (9.4.9)
reside in those of the resolvent $R(\mu, x)$ as a function of $x$. The discussion in
Section 9.2 shows that $R(\mu, x)$ is locally (F)-analytic. Now we can rewrite (9.2.12)
as

\[ R(\mu, x + \alpha h) = R(\mu, x) \sum_{k=0}^{\infty} \alpha^{k}\, [h R(\mu, x)]^{k} \tag{9.4.11} \]

and have to examine what limitations have to be put on the arguments $\alpha$, $\mu$, $h$, and $x$
for the formula to hold in the present context. Here the main requirement is that
the series converge in norm for we can then show by direct substitution that the
second resolvent equation is satisfied in the form

\[ \alpha R(\mu, x + \alpha h)\, h\, R(\mu, x) = R(\mu, x + \alpha h) - R(\mu, x). \tag{9.4.12} \]


Since $x \in S_{\Delta}$ we have $\sigma(x) \subset \Delta$. We keep $x$ fixed. Then $R(\mu, x)$ is a
$\mathfrak{B}$-holomorphic function of $\mu$ in the complement $C(\Delta)$ of $\Delta$ and this function tends to zero
as $\mu \to \infty$. Hence there exists a finite $M = M(x)$ such that

\[ \|R(\mu, x)\| \le M, \qquad \mu \in C(\Delta). \tag{9.4.13} \]

Suppose now that $\alpha \in C$, $h \in \mathfrak{B}$ are any quantities such that $|\alpha|\, M\, \|h\| \le \gamma < 1$.
It is then seen that the series (9.4.11) converges in norm, uniformly in the
arguments for

\[ \mu \in C(\Delta), \qquad |\alpha| \le 1, \qquad \|h\| \le \gamma M^{-1}, \tag{9.4.14} \]

and for such values the sum of the series is dominated in norm by

\[ M (1 - \gamma)^{-1}. \tag{9.4.15} \]

Thus the sum of the series represents a solution of (9.4.12) for the stated values of
the arguments and the sum of the series is the resolvent of $x + \alpha h$ at $\lambda = \mu$. Thus
$R(\mu, x + \alpha h)$ exists and is bounded in $C(\Delta)$ for $\alpha$, $h$ and $x$ as indicated. We can now
prove


Theorem 9.4.1. $S_{\Delta}$ is an open set in $\mathfrak{B}$, possibly not connected. For any $f \in F$,
any $x \in S_{\Delta}$, $|\alpha| \le 1$, $h \in \mathfrak{B}$ with $\|h\| \le \gamma [M(x)]^{-1}$, the corresponding function $f(x)$
is locally (F)-analytic and

\[ f(x + \alpha h) = f(x) + \sum_{k=1}^{\infty} \alpha^{k}\, \frac{1}{2\pi i} \int_{\Gamma} f(\mu)\, R(\mu, x)\, h R(\mu, x) \cdots h R(\mu, x)\, d\mu. \tag{9.4.16} \]

There are $k + 1$ factors $R(\mu, x)$ in the coefficient of $\alpha^{k}$ alternating with $k$ factors $h$.
For fixed $x$ the series converges in norm and uniformly with respect to $\alpha$ and $h$,
restricted as indicated.


Proof. By the preceding argument $R(\mu, x + \alpha h)$ exists for $\mu \in C(\Delta)$ when $\alpha$, $h$
and $x$ are restricted as stated. This asserts that the spectrum of $x + \alpha h$ is confined
to $\Delta$ so that $x + \alpha h \in S_{\Delta}$. This implies the existence of a sphere in $\mathfrak{B}$ with center at $x$
all the points of which are in $S_{\Delta}$, so that $S_{\Delta}$ is an open set. One of the components
of this open set contains the set of elements $\{\lambda e;\ \lambda \in \Delta\}$. However, we have no
assurance that this is the only component.
Now since $x + \alpha h \in S_{\Delta}$, we have by definition

\[ f(x + \alpha h) = \frac{1}{2\pi i} \int_{\Gamma} f(\mu)\, R(\mu, x + \alpha h)\, d\mu. \]

Here we can substitute the series (9.4.11) and integrate termwise with respect to $\mu$
since the series converges absolutely and uniformly. This gives (9.4.16). ∎
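The series (9.4.11) behind this theorem can be summed numerically. The sketch below is an illustration with matrices chosen here, not taken from the text; it works at a single point $\mu$, so the bound $M$ of (9.4.13) is replaced by $\|R(\mu, x)\|$ at that point, and the function names are hypothetical.

```python
import numpy as np

def resolvent(mu, x):
    """R(mu, x) = (mu e - x)^{-1}."""
    return np.linalg.inv(mu * np.eye(x.shape[0], dtype=complex) - x)

def perturbed_resolvent_series(mu, x, h, alpha, terms):
    """Partial sum of (9.4.11): sum_k alpha**k R (h R)**k with R = R(mu, x)."""
    r = resolvent(mu, x)
    total = np.zeros_like(r)
    term = r.copy()                      # k = 0 term
    for _ in range(terms):
        total += term
        term = alpha * term @ h @ r      # append one more factor h R
    return total

x = np.array([[1.0, 2.0], [0.0, 3.0]])
h = np.array([[0.0, 0.1], [0.2, 0.0]])
mu, alpha = 6.0 + 1.0j, 0.8

by_series = perturbed_resolvent_series(mu, x, h, alpha, terms=80)
direct = resolvent(mu, x + alpha * h)
# Pointwise analogue of the condition |alpha| M ||h|| <= gamma < 1.
gamma = abs(alpha) * np.linalg.norm(resolvent(mu, x), 2) * np.linalg.norm(h, 2)
```

With `gamma` well below 1 the partial sums agree with the directly computed resolvent of $x + \alpha h$, and the second resolvent equation (9.4.12) holds to rounding error.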


Definition 9.4.2. A linear mapping $T$ of the class $F$ into a class $G$ of functions $g$
defined and locally (F)-analytic in $S_{\Delta}$ shall be said to be continuous if $g = T[f]$
and

\[ \lim_{n\to\infty} \sup_{\lambda \in \Delta} |f_{n}(\lambda) - f(\lambda)| = 0 \tag{9.4.17} \]

implies

\[ \lim_{n\to\infty} T[f_{n}](x) = T[f](x) \tag{9.4.18} \]

locally uniformly in $x$.


Here "locally uniformly" refers to the possibility of $S_{\Delta}$ having infinitely many
disjoint components. The convergence is to be uniform in each of them. We now
have Gelfand's Uniqueness Theorem.


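Theorem 9.4.2. The mapping $T$ of $F$ into $H$, defined by (9.4.9) and (9.4.10), is a
continuous isomorphism taking 1 and $\lambda$ into $e$ and $x$, respectively, and it is the
only linear mapping with these properties.


Proof. It is clear that $T$ maps 1 and $\lambda$ into $e$ and $x$, respectively. The mapping is
1-1 for if the elements $f_{1}$ and $f_{2}$ of $F$ are distinct, then $T[f_{1}](x)$ and $T[f_{2}](x)$ are
distinct for $x = \lambda e$ for $\lambda$ in an open non-void subset of $\Delta$. Hence $T[f_{1}] \neq T[f_{2}]$.
It is clear that the mapping is linear so sums go into sums, but it is not clear that
products go into products. To prove this form

\[ f_{1}(x)\, f_{2}(x) = \frac{1}{2\pi i} \int_{\Gamma_{1}} f_{1}(\mu)\, R(\mu, x)\, d\mu \cdot \frac{1}{2\pi i} \int_{\Gamma_{2}} f_{2}(\nu)\, R(\nu, x)\, d\nu. \]

Here $\Gamma_{1}$ and $\Gamma_{2}$ are at our disposal, subject to two conditions: they must surround
$\sigma(x)$ once in the positive sense and they should be confined to $\Delta \cup \Gamma$. We can use
this freedom of choice to take $\Gamma_{2} = \Gamma$ and $\Gamma_{1}$ interior to $\Gamma$. We can then write the
product of the two integrals as a double integral

\[ \frac{1}{(2\pi i)^{2}} \int_{\Gamma_{1}} \int_{\Gamma_{2}} f_{1}(\mu) f_{2}(\nu)\, R(\mu, x) R(\nu, x)\, d\mu\, d\nu = \frac{1}{(2\pi i)^{2}} \int_{\Gamma_{1}} \int_{\Gamma_{2}} f_{1}(\mu) f_{2}(\nu)\, \frac{R(\nu, x) - R(\mu, x)}{\mu - \nu}\, d\mu\, d\nu \]

by (9.2.3). This can be written as the difference of two iterated integrals where the
order of integration, which is immaterial, is arranged to suit our convenience:

\[ \frac{1}{2\pi i} \int_{\Gamma_{2}} f_{2}(\nu)\, R(\nu, x) \Bigl\{ \frac{1}{2\pi i} \int_{\Gamma_{1}} \frac{f_{1}(\mu)}{\mu - \nu}\, d\mu \Bigr\} d\nu - \frac{1}{2\pi i} \int_{\Gamma_{1}} f_{1}(\mu)\, R(\mu, x) \Bigl\{ \frac{1}{2\pi i} \int_{\Gamma_{2}} \frac{f_{2}(\nu)}{\mu - \nu}\, d\nu \Bigr\} d\mu. \]

Here the inner integrals are numerical Cauchy integrals. The first one is 0 since $\nu$
lies outside of $\Gamma_{1}$, the contour of integration, while the second one equals $-f_{2}(\mu)$.
Hence we are left with

\[ \frac{1}{2\pi i} \int_{\Gamma_{1}} f_{1}(\mu) f_{2}(\mu)\, R(\mu, x)\, d\mu, \]

which is $T$ operating on the product of $f_{1}$ and $f_{2}$ in $F$. Thus

\[ T[f_{1} f_{2}] = T[f_{1}]\, T[f_{2}] \tag{9.4.19} \]

and products go into products.
As a byproduct of this argument we note that

\[ f_{1}(x)\, f_{2}(x) = f_{2}(x)\, f_{1}(x), \tag{9.4.20} \]

so that $f_{1}(x)$ and $f_{2}(x)$ commute. On the other hand, no information is available whether
or not $f_{1}(x_{1})$ and $f_{2}(x_{2})$ commute for distinct values of $x_{1}$ and $x_{2}$.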

Thus $T$ is 1-1, linear, and maps products into products, so it is an isomorphism.
It should also be proved that $T$ is continuous in the sense of Definition 9.4.2.
We form

\[ f_{n}(x) - f(x) = \frac{1}{2\pi i} \int_{\Gamma} [f_{n}(\mu) - f(\mu)]\, R(\mu, x)\, d\mu. \]

Suppose that for a given $\varepsilon > 0$ and for $n > k$ we have

\[ |f_{n}(\lambda) - f(\lambda)| \le \varepsilon, \qquad \lambda \in \Delta. \]

By continuity this estimate also holds on $\Gamma$ and for $\mu$ on $\Gamma$ we also have
$\|R(\mu, x)\| \le M(x)$ by (9.4.13). Hence

\[ \|f_{n}(x) - f(x)\| \le \varepsilon\, l(\Gamma)\, M(x), \]

where $l(\Gamma)$ is the length of $\Gamma$. Moreover, if (9.4.14) holds, we have

\[ \|f_{n}(x + \alpha h) - f(x + \alpha h)\| \le \varepsilon\, l(\Gamma)\, (1 - \gamma)^{-1} M(x). \tag{9.4.21} \]

This shows that (9.4.18) holds locally uniformly in $S_{\Delta}$.

Finally uniqueness of the mapping should be proved. It is to be shown that
(9.4.9) is the only way of defining a mapping from $F$ into a set of functions from $\mathfrak{B}$
to $\mathfrak{B}$ defined and locally (F)-analytic in $S_{\Delta}$ if the mapping is to be a continuous
isomorphism taking 1, $\lambda$ into $e$, $x$. Suppose that $T_{0}$ is such a mapping. Since $T_{0}$ is
an isomorphism, a polynomial in $\lambda$ goes into the corresponding polynomial in $x$,

\[ P(\lambda) = \alpha_{0} + \sum_{k=1}^{n} \alpha_{k} \lambda^{k} \ \to\ P(x) = \alpha_{0} e + \sum_{k=1}^{n} \alpha_{k} x^{k}. \]

Since

\[ (\lambda - \lambda_{0})^{n} \to (x - \lambda_{0} e)^{n}, \]

this implies, by the continuity property, that formula (9.4.5) is a consequence of
(9.4.4), which is equivalent to saying that (9.4.10) implies (9.4.9). To start with, this
holds when $\sigma(x)$ is interior to the circle $|\mu - \lambda_{0}| = \rho$, which is taken as the contour
$\Gamma$. But if $T_{0}$ coincides with $T$ for arbitrary circular disks, then, again by continuity,
$T_{0} = T$ for arbitrary domains $\Delta$. ∎


These are the main theorems in the Gelfand Operational Calculus. After the
appearance of Gelfand's work numerous additions and modifications have been
made to the theory. To round out the discussion we shall also prove Gelfand's
Spectral Mapping Theorem.


Theorem 9.4.3. If fe F and χε Sy and if f(x) = TLf](x), then 
ol f(x)] =flo(x)]. (9.4.22) 


Proof. Suppose that $\alpha \in \sigma(x)$. It is required to prove that $f(\alpha) \in \sigma[f(x)]$ and,
moreover, that all spectral values of $f(x)$ are of this form. To this end, consider

$$q(\lambda, \alpha) = [f(\lambda) - f(\alpha)](\lambda - \alpha)^{-1},$$

where for $\lambda = \alpha$ the indeterminate form is to be replaced by its limit $f'(\alpha)$. Then
$q(\lambda, \alpha)$ is a Cauchy holomorphic function of $\lambda$ in $\Delta$ and belongs to $F$ since it is
continuous on $\Gamma$. From

$$f(\lambda) - f(\alpha) = (\lambda - \alpha)\, q(\lambda, \alpha)$$

we obtain by the mapping $T$

$$f(x) - f(\alpha) e = (x - \alpha e)\, q(x, \alpha).$$

Here $x - \alpha e$ is singular by assumption. This makes $f(x) - f(\alpha) e$ singular, or
$f(\alpha) \in \sigma[f(x)]$.

The proof of the converse requires a more cautious approach. Suppose that
$\beta$ belongs to $\sigma[f(x)]$ but not to $f[\sigma(x)]$. Then there exists a domain $\Delta_0$ with oriented
rectifiable boundary $\Gamma_0$ such that $\sigma(x) \subset \Delta_0 \subset \Delta$ and the equation $f(\lambda) = \beta$ has no
roots in $\bar{\Delta}_0$. Set $h(\lambda) = \beta - f(\lambda)$ and note that both $h(\lambda)$ and $[h(\lambda)]^{-1}$ are Cauchy
holomorphic in $\Delta_0$ and continuous in $\bar{\Delta}_0$. They then belong to the class $F(\Delta_0)$,
which contains the restriction of $F = F(\Delta)$ to $\Delta_0$. We write $S_{\Delta_0}$ for the set of
elements of $\mathfrak{B}$ such that $\sigma(x) \subset \Delta_0$. This set is open and $S_{\Delta_0} \subset S_\Delta$. Formula (9.4.9)
with $\Gamma$ replaced by $\Gamma_0$ now defines the mapping of $F(\Delta_0)$ into the corresponding
class $H(\Delta_0)$. Hence $h(\lambda) \to h(x)$ and $[h(\lambda)]^{-1} \to [h(x)]^{-1} = [\beta e - f(x)]^{-1}$. This
is impossible, for $\beta e - f(x)$ is singular by assumption. It follows that no domain $\Delta_0$
can exist with the stated properties. Hence the equation

$$f(\lambda) = \beta$$

has at least one root in $\Delta$. ∎
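
The content of (9.4.22) can be checked by hand in the simplest B-algebra, that of square matrices. The following is a minimal sketch, not part of the original argument: the matrix `x` and the polynomial `coeffs` are illustrative assumptions, and an upper triangular matrix is chosen so that its spectrum can be read off the diagonal.

```python
# Sketch: spectral mapping for a polynomial f and an upper triangular matrix x.
# The matrix and the polynomial are illustrative assumptions.

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_of_matrix(coeffs, x):
    """Evaluate c0*e + c1*x + ... + cm*x^m by Horner's rule."""
    n = len(x)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, x)
        for i in range(n):
            result[i][i] += c  # add c times the unit element e
    return result


def poly_of_number(coeffs, lam):
    """Evaluate the same polynomial at a number lam."""
    value = 0
    for c in reversed(coeffs):
        value = value * lam + c
    return value


x = [[1, 5, -2],
     [0, 2, 7],
     [0, 0, -3]]
coeffs = [1, 0, 1]  # f(lambda) = 1 + lambda^2

fx = poly_of_matrix(coeffs, x)
# For a triangular matrix the spectrum is the set of diagonal entries.
spectrum_x = [x[i][i] for i in range(3)]
spectrum_fx = [fx[i][i] for i in range(3)]
image_of_spectrum = [poly_of_number(coeffs, lam) for lam in spectrum_x]
```

Since `fx` is again upper triangular, comparing `spectrum_fx` with `image_of_spectrum` compares the two sides of (9.4.22) for this particular choice.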


EXERCISE 9.4 


1. Under what conditions is exp (x + y) = exp (x) exp (y)? 
2. If j is an idempotent commuting with x, show that 
$$\exp(x + 2\pi i j) = \exp(x).$$


3. There are two ways of defining a square root of $e - x$ which reduces to $e$ for $x = 0$.
One is by the binomial series, the other by (9.4.9) for a suitable choice of $f$. Give
these representations and discuss their validity.


4. Suppose that o(x) lies in the open right half-plane. Define a square root of x using 
(9.4.9). Show that the root is locally (F)-analytic (where?) and represent df(x; A) 
by a contour integral. What is the spectrum of the root? 


5. Under what conditions on x can (9.4.9) be used to define a logarithm of x? 


6. If g is a quasinilpotent, show that 
co /1 
at ελτ' τ ΠῚ rot 
n=] 


is a square root of $R(\lambda, g)$. For what values of $\lambda$ does the series converge?


7. The Cauchy integral of the product of two Cauchy holomorphic functions should 
be the product of the Cauchy integrals of the two factors. Verify this! 


The remaining problems deal with the operational calculus in a B-algebra without unit
element. $D(\lambda, x)$ denotes the dissolvent of $x$ and $\sigma(x)$ its spectrum. Let $\Delta$ be a simply
connected domain in the complex plane, containing the origin and bounded by an oriented
rectifiable contour $\Gamma$. $F$ is the class of all functions $f$ Cauchy holomorphic in $\Delta$ and
continuous in $\bar{\Delta}$ such that $f(0) = 0$; $S_\Delta$ is the set of all $x \in \mathfrak{B}$ such that $\sigma(x) \subset \Delta$. Let $T$
be the mapping defined by

$$T[f](x) = f(x) = \frac{1}{2\pi i} \int_\Gamma f(\mu)\, D(\mu, x)\, \frac{d\mu}{\mu}.$$

8. Show that $S_\Delta$ is an open set in $\mathfrak{B}$.


9. Show that $T$ defines a continuous isomorphism in which 0 and $\lambda$ correspond to 0 and
$x$, respectively. Here the first zero is that of $C$, the second that of $\mathfrak{B}$.

10. Show that $T$ is the only mapping with these properties.


11. If $\mathfrak{B}$ is the convolution algebra $L_1(-\pi, \pi)$, how are the Fourier coefficients of $T[f]$
obtained from those of $f$?


10 LINEAR TRANSFORMATIONS


Linear operators have been with us almost from the beginning of this treatise. 
The time has come to tie various loose ends and supplement the earlier discussion, 
especially that of Chapter 2. Extensions of linear operators, the Banach—Steinhaus 
theorem, and the closed graph theorem are among the neglected topics which will 
now be considered. So far we have restricted ourselves to linear bounded operators. 
In this chapter unbounded operators will also be admitted provided they are closed. 
Various linear bounded functionals have occurred earlier. Here the missing 
existence proofs will be provided. 

While inverse operators have been considered in many places, we shall now 
also deal with adjoint operators. They will play an important and more spectacular 
role in the next chapter. In a Hilbert space, adjoint operators are much more
manageable and easier to visualize than in general Banach spaces. Finally, we
must specialize the discussion of B-algebras in Chapter 9 to the case of operator
algebras of the type $\mathfrak{E}(X)$. Since we shall allow unbounded operators, it will be
necessary to mention briefly how spectra and resolvents may change in the new 
setting. 

There are five sections: Boundedness; Closure; Linear functionals; Inverses 
and adjoints; and Spectra and resolvents. 


10.1 BOUNDEDNESS 


We start by recalling the basic properties of linear transformations as defined in 
Chapter 2. The reader is advised to review this chapter. 

We restrict ourselves to linear mappings from one B-space $X$ to another $Y$,
both over the complex field. The domain $D(T)$ of $T$ is a linear subspace of $X$,
possibly $X$ itself, and the range $R(T)$ of $T$ is a linear subspace of $Y$, possibly $Y$
itself.


Definition 10.1.1. $T$ is a linear mapping (equivalently a linear transformation
or a linear operator) from $X$ to $Y$ if $D(T)$ and $R(T)$ are linear subspaces of
$X$ and $Y$, respectively, and if for all $x_1, x_2 \in D(T)$ and all $\alpha, \beta \in C$

$$T(\alpha x_1 + \beta x_2) = \alpha T(x_1) + \beta T(x_2). \tag{10.1.1}$$

This is a generalization of Definition 2.2.2.

A linear transformation $T$ maps the zero element of $X$ on the zero element of $Y$,

$$T(0) = 0. \tag{10.1.2}$$

There are possibly other elements $x$ of $X$ that are annihilated by $T$. The set of all
such elements $x$ is the nullspace $N[T]$ or kernel of $T$,

$$N[T] = \{x;\ T(x) = 0\}. \tag{10.1.3}$$

It is clear that $N[T] \subset D(T)$ is also a linear subspace of $X$.


Definition 10.1.2. A linear transformation $T$ from $X$ to $Y$ is bounded on $D(T)$
if there exists a constant $M$ such that for all $x \in D(T)$

$$\|T(x)\| \le M \|x\|. \tag{10.1.4}$$

The least value of $M$ for which the inequality is true is called the norm of
$T$ on $D(T)$, written $\|T\|$.
The norm, when it exists, may be defined directly by

$$\|T\| = \sup\, [\|T(x)\|;\ \|x\| = 1,\ x \in D(T)]. \tag{10.1.5}$$

Note that all elements of $X$ other than 0 are positive multiples of elements on the
unit sphere $\|x\| = 1$.

All linear bounded transformations with $D(T) = X$ and $R(T) \subset Y$ form
the set $\mathfrak{E}(X, Y)$, with $\mathfrak{E}(X)$ written instead of $\mathfrak{E}(X, X)$. We recall (see Section 2.4)
that these are linear vector spaces which become B-spaces under the norm $\|T\|$.


Definition 10.1.3. Let $X \times Y$ be the Cartesian product of $X$ and $Y$, i.e. the
set of all ordered pairs

$$\{(x, y);\ x \in X,\ y \in Y\}. \tag{10.1.6}$$

The graph $G$ of a mapping $T$ from $X$ to $Y$ is the set of all ordered pairs

$$\{[x, T(x)];\ x \in D(T)\} \subset X \times Y. \tag{10.1.7}$$

We define a metric on $X \times Y$ by setting

$$\|(x, y)\| = \|x\| + \|y\|. \tag{10.1.8}$$

Note that the norm symbol $\|\cdot\|$ refers successively to the spaces $X \times Y$, $X$,
and $Y$.


Theorem 10.1.1. The product space $X \times Y$ is complete in the metric defined
by (10.1.8).


The proof is left to the reader. Actually the space is a B-space if we define

$$(x_1, y_1) + (x_2, y_2) = (x_1 + x_2,\ y_1 + y_2), \qquad \alpha(x, y) = (\alpha x, \alpha y). \tag{10.1.9}$$

Definition 10.1.4. A transformation $y = T(x)$ from $X$ to $Y$ is said to be
closed if its graph is a closed subset of $X \times Y$.


Thus $T$ is closed iff

$$\{x_n\} \subset D, \quad x_n \to x_0, \quad \text{together with} \quad y_n = T(x_n) \to y_0 \tag{10.1.10}$$

implies that

$$x_0 \in D \quad \text{and} \quad T(x_0) = y_0.$$


Definition 10.1.5. If $T_1(x)$ and $T_2(x)$ are mappings from $X$ to $Y$ with
domains $D_1$ and $D_2$, respectively, if $D_1 \subset D_2$ and if $T_1(x) = T_2(x)$ on $D_1$,
then $T_2$ is called an extension of $T_1$, written $T_1 \subset T_2$.


Theorem 10.1.2. Let $y = T(x)$ be a linear transformation from $D \subset X$ to $Y$
with graph $G$. Then $T$ has a closed extension iff no vector pair $(0, y)$ with
$y \ne 0$ belongs to $\bar{G}$, the closure of $G$. If this is the case, then $\bar{G}$ is the graph of
the smallest closed linear extension of $T$.


Proof. $\bar{G}$ is linear in $X \times Y$ since $G$ has this property. Now $\bar{G}$ is the graph of some
transformation, say $T^{\circ}$. This transformation is linear and closed since the
subspace $\bar{G}$ is linear and closed. Further, suppose that $(x, y_1)$ and $(x, y_2)$ belong
to $\bar{G}$. Then so does $(x, y_1) - (x, y_2) = (0, y_1 - y_2)$ and by hypothesis this
requires that $y_1 = y_2$. It follows that $T^{\circ}$ is single-valued. It is clear that $\bar{G}$ is the
graph of the smallest closed linear extension of $T$. Conversely, if $T$ is 1-1 on $D$,
no proper extension of $T$ can take 0 into any element but 0. ∎
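
The condition of Theorem 10.1.2 can fail. The following worked example is an illustration added here and is not taken from the text: the choices $X = C[0,1]$, $Y = C$ and $T[f] = f'(0)$ are assumptions made for the sketch.

```latex
% Assumed example: X = C[0,1] with the sup norm, Y = C (the complex numbers),
% D(T) = C^1[0,1], T[f] = f'(0).
\[
f_n(t) = \frac{1}{n}\sin(nt), \qquad \|f_n\| \le \frac{1}{n} \to 0, \qquad
T[f_n] = f_n'(0) = \cos(0) = 1 .
\]
% Hence (f_n, T[f_n]) -> (0, 1) in X x Y, so the pair (0, 1) lies in the closure
% of the graph and, by Theorem 10.1.2, this T has no closed extension.
```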


We return to linear bounded transformations. 


Theorem 10.1.3. If $T(x)$ is a linear bounded transformation from $D \subset X$ to
$Y$ with norm $\|T\|$, then $T$ has a unique linear bounded extension $T^{\circ}$ defined on
$\bar{D}$ and $\|T^{\circ}\| = \|T\|$.


Proof. If $D$ is a closed subset of $X$, then the theorem is trivially true
for $T^{\circ} = T$. If $D$ is not closed, the extension is obtained as follows. Let
$x_0 \in \bar{D} \ominus D$, let $\{x_n\} \subset D$ and $\lim x_n = x_0$. Then, if $y_n = T(x_n)$,

$$\|y_m - y_n\| = \|T(x_m) - T(x_n)\| = \|T(x_m - x_n)\| \le \|T\|\, \|x_m - x_n\|.$$

Here $\{x_n\}$ is a Cauchy sequence in $X$, whence it follows that $\{y_n\}$ is a Cauchy
sequence in $Y$. Let its limit be $y_0$.

We have to show that $y_0$ is independent of the choice of a Cauchy sequence
in $X$. Suppose that $\{z_n\} \subset D$, $\lim z_n = x_0$, and $w_n = T(z_n)$. Since $D$ is linear,
$x_n - z_n \in D$ and is a Cauchy sequence in $X$ converging to zero. Hence

$$\|w_n - y_n\| = \|T(z_n) - T(x_n)\| = \|T(z_n - x_n)\| \le \|T\|\, \|z_n - x_n\|,$$

so that $\{w_n - y_n\}$ is a Cauchy sequence in $Y$, converging to zero. Here
$\lim y_n = y_0$, so we must have $\lim w_n = y_0$, i.e. $y_0$ is independent of our choice of
Cauchy sequence in $X$ converging to $x_0$.

We can then define

$$T^{\circ}(x_0) = \lim T(x_n), \quad x_0 \in \bar{D} \ominus D; \qquad T^{\circ}(x) = T(x), \quad x \in D.$$

Further, note that

$$\|y_0\| = \|T^{\circ}(x_0)\| = \lim \|T(x_n)\| \le \|T\| \lim \|x_n\| = \|T\|\, \|x_0\|,$$

so that $\|T^{\circ}\| \le \|T\|$. Here we must have equality, since $T^{\circ}(x) = T(x)$ on $D$.
The uniqueness of the extension follows directly from the facts that $D(T)$ is
dense in $\bar{D}$ and $Y$ is a Hausdorff space. ∎


Corollary. A linear bounded transformation from $X$ to $Y$ whose domain $D$ is dense in
$X$ admits of a unique extension to all of $X$ with norm unchanged.


Somewhat similar considerations lead to the Banach—Steinhaus theorem. 


Theorem 10.1.4. Let $\{T_n\}$ be a sequence of linear bounded transformations
in $\mathfrak{E}(X, Y)$ such that (1) $\|T_n\| \le M$ for all $n$, and (2) $\lim T_n(x)$ exists for every
$x$ in a set $G$, dense in a sphere $S$. Then $\lim T_n(x)$ exists for all $x$, and the
limit defines an element $T$ of $\mathfrak{E}(X, Y)$ with $\|T\| \le \liminf \|T_n\|$.


Proof. We start by proving convergence in $S$. Let $x_0 \in S \ominus G$, let $\{x_n\} \subset G$,
and let $\lim x_n = x_0$. Then for any choice of positive integers $j, k, n$ we have

$$\|T_j(x_0) - T_k(x_0)\| \le \|T_j(x_0) - T_j(x_n)\| + \|T_j(x_n) - T_k(x_n)\| + \|T_k(x_n) - T_k(x_0)\|,$$

so that

$$\limsup_{j,k \to \infty} \|T_j(x_0) - T_k(x_0)\| \le 2M \|x_0 - x_n\|.$$

This holds for all $n$ and $\lim x_n = x_0$. Hence the sequence $\{T_j(x_0)\}$ converges
and we have convergence everywhere in $S$. Suppose that $S$ is the sphere
$\{x;\ \|x - s_0\| < r\}$. Suppose that $x$ is a point of $X$ exterior to $S$. We can then
find a point $y$ in $S$ and a positive number $\rho < 1$ such that

$$y - s_0 = \rho(x - s_0) \quad \text{or} \quad x = \Bigl(1 - \frac{1}{\rho}\Bigr) s_0 + \frac{1}{\rho}\, y.$$

Since $T_n$ is linear,

$$T_n(x) = \Bigl(1 - \frac{1}{\rho}\Bigr) T_n(s_0) + \frac{1}{\rho}\, T_n(y).$$

Here $s_0$ and $y$ are in $S$ and it follows that the sequence $\{T_n(x)\}$ converges
everywhere in $X$ and

$$T(x) = \lim T_n(x)$$

is defined for all $x$. Here $T$ is obviously linear. It is also bounded, for
$\|T_n\| \le M$ implies $\|T\| \le M$. Thus $T \in \mathfrak{E}(X, Y)$. Now the pointwise convergence of
$\{T_n\}$ implies convergence of any subsequence. Choosing a suitable
subsequence, the estimate $\|T\| \le M$ may be improved to $\|T\| \le \liminf \|T_n\|$. ∎

We end this discussion by adducing three related examples which illustrate
various points in the preceding discussion. In all three $X = Y = C[0, 1]$ with the
sup norm.


Example 1. Define

$$T[f](t) = \int_0^t f(s)\, ds, \qquad 0 \le t \le 1. \tag{10.1.12}$$

Here $T \in \mathfrak{E}(X)$ and $\|T\| = 1$. The range $R(T)$ is a proper subset of $Y = X$,
namely that subset of $X$ whose elements $g$ satisfy the conditions (1) $g(0) = 0$, and
(2) $g'(t)$ exists and belongs to $X$. The first condition implies that $R(T)$ is
non-dense in $X$. For if $f \in X$ and $f(0) = a$, $|a| > 0$, then the distance of $f$ from
$R(T)$ is $\ge |a|$.
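
The claim $\|T\| = 1$ in Example 1 can be probed numerically. The sketch below is an added illustration under stated assumptions: the integral is replaced by a trapezoidal sum, the sup norm by a maximum over a finite grid, and the helper names `integrate_from_zero` and `sup_norm` are hypothetical.

```python
import math


def integrate_from_zero(f, t, steps=1000):
    """Trapezoidal approximation of T[f](t), the integral of f over [0, t]."""
    if t == 0:
        return 0.0
    h = t / steps
    total = 0.5 * (f(0.0) + f(t))
    for k in range(1, steps):
        total += f(k * h)
    return total * h


def sup_norm(g, points=201):
    """Maximum of |g| over an equally spaced grid on [0, 1]."""
    return max(abs(g(k / (points - 1))) for k in range(points))


# The constant function 1 has norm 1 and T[1](t) = t, so the bound is attained.
one = lambda t: 1.0
norm_T_one = sup_norm(lambda t: integrate_from_zero(one, t))

# An oscillating function of norm 1 is mapped to something much smaller.
wiggle = lambda t: math.cos(7 * t)
ratio = sup_norm(lambda t: integrate_from_zero(wiggle, t)) / sup_norm(wiggle)
```

For the constant function the ratio of norms is 1 up to rounding, while for the oscillating function it stays well below 1, in agreement with $\|T\| = 1$.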


Example 2. Take

$$U[f](t) = f'(t). \tag{10.1.13}$$

Here $D(U)$ is a proper subset of $X$ since continuous functions are "normally"
not differentiable. On the other hand, every continuous function can be
approximated uniformly by differentiable functions, in fact even by polynomials. This
expresses that $D(U)$ is dense in $X$. Further, $U$ is unbounded in $D(U)$. To show
this we can consider

$$f_n(t) = \sin(n\pi t) \quad \text{with} \quad f_n'(t) = n\pi \cos(n\pi t).$$

Here $\|f_n\| = 1$ identically in $n$ while $\|f_n'\| = n\pi$. Thus $U$ is unbounded.
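
The growth of the ratio of norms in Example 2 can be tabulated directly. This is a small added sketch: the sup norm is approximated by a maximum over a finite grid (an assumption that happens to hit the exact extrema for the values of $n$ used), and the names `sup_norm` and `growth` are hypothetical.

```python
import math


def sup_norm(g, points=20001):
    """Maximum of |g| over an equally spaced grid on [0, 1]."""
    return max(abs(g(k / (points - 1))) for k in range(points))


def growth(n):
    """Ratio of the sup norms of f' and f for f(t) = sin(n*pi*t)."""
    f = lambda t: math.sin(n * math.pi * t)
    df = lambda t: n * math.pi * math.cos(n * math.pi * t)
    return sup_norm(df) / sup_norm(f)


# The ratio grows like n*pi, so no single constant M can bound it.
ratios = {n: growth(n) for n in (1, 2, 4, 8, 16)}
```

Each entry of `ratios` is close to $n\pi$, which is the statement that no constant $M$ satisfies (10.1.4) for the differentiation operator.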

Example 3. With $U$ as in the preceding example, let us note that the unbounded
operator $U$ is closed in the sense of Definition 10.1.4. We have to show that
$\{f_n\} \subset D$, $f_n \to f_0 \in X$ together with $f_n' \to g_0 \in X$ implies that $f_0 \in D$ and $f_0' = g_0$.


The first assumption implies in particular pointwise convergence, so that
$\lim f_n(0) = f_0(0)$. Further,

$$f_n(t) = f_n(0) + \int_0^t f_n'(s)\, ds \to f_0(0) + \int_0^t g_0(s)\, ds$$

by the uniform convergence of the sequences $\{f_n(t)\}$ and $\{f_n'(t)\}$. On the other
hand, the third member equals $f_0(t)$ by assumption. This shows that $f_0(t)$ is
differentiable, i.e. $f_0 \in D(U)$, $f_0'(t) = g_0(t) \in X$ and $U$ is closed.


These examples illustrate the general observation that differential operators 
tend to be unbounded but closed. This observation will be made more precise 
later. The importance of closed operators will be brought out in the next section. 


EXERCISE 10.1 


1. Take X = /, and the shift operator T {x,} = {x,,,}. What is ||7 ||? What is R(T)? 
Is R(T) dense in ],?
2. With ἃ =/, and U{x,}= {nx,}, determine D(U) and R(U). Is U bounded?
Is D(U) dense in /,? 

3. With the same conventions, is U closed? 

4. Take $X = C[0,1]$ and $T[f](t) = \int_0^t (t - s) f(s)\, ds$ for $0 \le t \le 1$. Find norm and
range of $T$. Is $R(T)$ dense in $X$?

5. With $X = C[0,1]$, take $U[f](t) = f''(t)$. Is $U$ bounded? Find $D(U)$ and $R(U)$.
Are they dense in $X$?

6. Show that U is closed. 

7. Take X = 1, and consider the vector-valued function 

F(t) = {ὁ “ἢ, 

Here {/,} is an increasing sequence of positive numbers. Show that F(r) € /,, for all 
t > 0. What is |/F(z)|| 2 What additional restrictions are needed on {/,} in order that 
F(t) belong to /,, for all real 12 How should the problem be modified to make sense 
in 1. 

8. Let the functions f,(t) be continuous and differentiable on (0, 00). Is the operator 
U{ (δ = {f(D} bounded on /,, if {4,0} Εἰς for all tr > 0? 


10.2 CLOSURE 


The notion of a closed linear transformation was introduced in Definition 10.1.4. 
Such transformations form a natural generalization of bounded transformations 
and, while unbounded, have enough properties in common with the bounded ones 
to be manageable. Most linear transformations encountered in analysis are closed. 
As a consequence of the definition we get 


Lemma 10.2.1. A linear bounded transformation from $X$ to $Y$ with $D(T) = X$
is closed.


The proof is left to the reader.

We denote by $\mathfrak{C}(X, Y)$ the set of all linear closed transformations from
$X$ to $Y$ and write $\mathfrak{C}(X)$ for $\mathfrak{C}(X, X)$. Here $\mathfrak{C}(X, Y)$ is in general not a linear system.
On the other hand, a closed transformation brings with it an infinite set of
"satellites" together with which it forms an equivalence class modulo
"boundedness".


Lemma 10.2.2. If $U \in \mathfrak{C}(X, Y)$ and if $T$ is a linear bounded transformation
with $D(T) \supset D(U)$, then $U + T$ is defined on $D(U)$ and is a closed linear
transformation and as such is a member of $\mathfrak{C}(X, Y)$.


Again the proof is left to the reader.

It follows that $\mathfrak{C}(X, Y)$ splits up into equivalence classes where two
transformations $U_1$ and $U_2$ belong to the same class if $D(U_1) = D(U_2) = D$ and
$U_1 - U_2$ is bounded on $D$.

We proceed to proving some theorems which are natural extensions of
theorems valid for bounded operators. The first is an extension of a theorem due
to S. Banach.


Theorem 10.2.1. Let $y = U(x)$ be a linear closed transformation from $D \subset X$
onto $R \subset Y$ and let $R$ be a set of the second category in $Y$. Then


(1) $R = Y$;
(2) there exists an $m > 0$ such that for every $y \in Y$ there is an $x \in D$ with
$y = U(x)$ and $\|x\| \le m \|y\|$.


Proof. This involves a number of steps. The first one aims at showing that $R$
is dense in $Y$. Using the assumption that $R$ is of the second category we shall
show that $R$ is dense in a sphere. Set for $n = 1, 2, 3, \ldots$

$$D_n = \{x;\ x \in D,\ \|x\| < n\}. \tag{10.2.1}$$

Then

$$D = \bigcup_{n=1}^{\infty} D_n \quad \text{and} \quad R = \bigcup_{n=1}^{\infty} U(D_n). \tag{10.2.2}$$

By assumption, $R$ is of the second category in $Y$, hence one of the image sets
$U(D_n)$ is of the second category in $Y$ and

$$\operatorname{Int}\bigl(\overline{U(D_n)}\bigr) \ne \emptyset.$$

A sphere $S = \{y;\ \|y - y_0\| < r_0\}$ is therefore contained in $\overline{U(D_n)}$. It follows
from $y_0 \in \overline{U(D_n)}$ that there exists $x_1 \in D_n$ with $y_1 = U(x_1)$ and $\|y_1 - y_0\| < \tfrac12 r_0$.
Let

$$S_0 = \{y;\ \|y\| < \tfrac12 r_0\},$$

then, by the continuity of the functions $(y, z) \to y + z$ and $(\lambda, y) \to \lambda y$,

$$S_0 \subset S - y_1 \subset \overline{U(D_n)} - y_1 \subset \overline{U(D_n) - y_1} \subset \overline{U(D_n) - U(D_n)} = \overline{U(D_{2n})}. \tag{10.2.3}$$

Thus

$$Y = \bigcup_{m=1}^{\infty} m S_0 \subset \bigcup_{m=1}^{\infty} m\, \overline{U(D_{2n})} = \bigcup_{m=1}^{\infty} \overline{U(m D_{2n})} \subset \bar{R} \tag{10.2.4}$$

and $R$ is dense in $Y$.

The second step involves showing that $R$ contains a sphere. Here is where
the property of $U$ being closed comes into play. It will be shown that the
sphere

$$S_1 = \{y;\ \|y\| < r_1\}, \qquad r_1 = \frac{r_0}{4n},$$

is contained in $R$. Formula (10.2.3) shows that the closure of the set $U(D_{2n})$
contains the sphere $S_0$. By an obvious contraction process we see that this implies

$$\overline{U(2^{-k} D_1)} \supset 2^{-k} S_1, \qquad k = 0, 1, 2, \ldots. \tag{10.2.5}$$

These inclusion relations are the basic tool for the following argument.

Let $y$ be an arbitrary point in $S_1$. Using (10.2.5) with $k = 0$, we can find a
point $x_1 \in D_1$ such that

$$\|y - U(x_1)\| < 2^{-1} r_1,$$

where $\|x_1\| < 1$. Since $y - U(x_1) \in \tfrac12 S_1$, we can go back to (10.2.5), now with
$k = 1$, to find a point $x_2 \in \tfrac12 D_1$ such that

$$\|y - U(x_1) - U(x_2)\| < 2^{-2} r_1,$$

and here $\|x_2\| < \tfrac12$. Repeating the process, we find a sequence of points $\{x_n\}$ all
in $D_1$ such that for each $n$

$$\Bigl\| y - \sum_{k=1}^{n} U(x_k) \Bigr\| < 2^{-n} r_1,$$

and now $\|x_n\| < 2^{-n+1}$. If $s_n = \sum_{k=1}^{n} x_k$ and $U(s_n) = t_n$, we see that $\lim s_n = x_0$
exists, $\|x_0\| \le 2$, and $\lim t_n = y$. Since $U$ is closed, this implies that $x_0 \in D(U)$
and $U(x_0) = y$. Now $y$ was an arbitrary point of the sphere $S_1$, whence it
follows that $R \supset S_1$. Again, by the homogeneity of $U$, this implies that
$R = Y$.

The last step is to show property (2) asserted in the theorem. This also
comes out of the preceding discussion. We can take $m = 4/r_1$. If $y$ is any element
of $Y$, we can find a contraction which will pull $y$ inside the sphere $S_1$ for which
we have bounded inverse images. Set

$$y_1 = r_1 (2\|y\|)^{-1} y.$$

Then $y_1 \in S_1$ and there is an $x_0$ with $\|x_0\| \le 2$ such that $U(x_0) = y_1$. We have
now

$$\|U(x_0)\| = \|y_1\| = \tfrac12 r_1 \ge \tfrac14 r_1 \|x_0\|$$

or

$$\|x_0\| \le \frac{4}{r_1} \|U(x_0)\|$$

as asserted. ∎
Corollary 1. If $U^{-1}$ exists, then it is bounded.


Proof. By Theorem 2.2.2 properties (1) and (2) are necessary and sufficient for
the boundedness of $U^{-1}$ when it exists, i.e. when the mapping is 1-1. ∎


Corollary 2. If $y = T(x)$ is a linear bounded transformation belonging to
$\mathfrak{E}(X, Y)$ and if $T$ is 1-1, then $T^{-1}$ is bounded.


Proof. By Lemma 10.2.1 $T$ is closed, so Theorem 10.2.1 applies. ∎
We come now to the closed graph theorem of S. Banach.


Theorem 10.2.2. If $y = U(x)$ is a closed linear transformation with domain $D$
of the second category in $X$, then $D = X$ and $U$ is bounded.

Proof. We define a mapping $V$ of the graph $G$ of $U$ onto $D$ by the convention

$$V[x, U(x)] = x. \tag{10.2.6}$$

Since $X \times Y$ is a B-space, this is a mapping from one B-space $X \times Y$ to
another $X$. It is clearly linear. Moreover, $V$ is closed, for if

$$(x_n, y_n) \to (x_0, y_0), \quad \text{then} \quad y_0 = U(x_0) \quad \text{and} \quad x_n \to x_0$$

by the closure of $U$. The mapping is 1-1, for the nullspace of $V$ contains only
the point $[0, U(0)] = (0, 0)$. The range of $V$ is $D$, by assumption of the second
category in $X$. We can then apply Theorem 10.2.1 with Corollary 1 and
conclude that the range of $V$ is all of $X$. Further, $V^{-1}$ exists and is bounded so that

$$\|[x, U(x)]\| = \|V^{-1}(x)\| \le M \|x\|$$

or

$$\|x\| + \|U(x)\| \le M \|x\|$$

by the definition of the norm in $X \times Y$. This shows that $U$ is bounded and is an
element of $\mathfrak{E}(X, Y)$. ∎
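
As a consistency check, the contrapositive of Theorem 10.2.2 can be read off for the differentiation operator. This remark is added here; it combines the theorem with Examples 2 and 3 of Section 10.1 and is not part of the original proof.

```latex
% Setting assumed from Section 10.1: X = Y = C[0,1] with the sup norm.
\[
U[f] = f', \qquad D(U) = \{ f \in C[0,1] : f' \in C[0,1] \}, \qquad
\|\sin(n\pi t)\| = 1, \quad \|U[\sin(n\pi t)]\| = n\pi .
\]
% U is closed (Example 3) and unbounded (Example 2). If D(U) were of the second
% category in C[0,1], Theorem 10.2.2 would force U to be bounded. Hence D(U) is
% a dense subspace of C[0,1] which is of the first category.
```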


As an application of linear closed transformations we give an extension of 
Theorem 7.4.3. There it was shown that a linear bounded transformation can be 
taken under the sign of integration. Here we want to examine under what 
circumstances this may be done when the operator is just closed. In the bounded 
case the integrand (or the integrator) is a vector function which assumes values 
in the domain of the operator. If $f$ is strongly continuous, so is $T[f]$; if $g$ is of
bounded variation, so is T[g]. In the closed unbounded case the situation is 
different. It is necessary to postulate that the values of f (or of g) are in 
D(U) and that U[_f] (or U[g]) has the properties that will ensure the existence 
of the Stieltjes integral. Since we are more interested in simple results, easy to 
remember and to apply, than in the utmost generality, we impose stronger 
restrictions on f and g than necessary. 


Theorem 10.2.3. Given $U \in \mathfrak{C}(X, Y)$. Let $f$ be a function defined on $[a, b]$
with $f(s) \in D(U)$ for $a \le s \le b$. If both $f(s)$ and $U[f](s)$ are strongly
measurable in $[a, b]$, further, if $g(s)$ is a numerically valued function of
bounded variation in $[a, b]$, then

$$U\Bigl[\int_a^b f(s)\, dg(s)\Bigr]$$

exists and

$$U\Bigl[\int_a^b f(s)\, dg(s)\Bigr] = \int_a^b U[f](s)\, dg(s). \tag{10.2.7}$$

Instead, if both $f(s)$ and $U[f](s)$ are of strongly bounded variation on $[a, b]$,
and if $g(s)$ is a numerically valued continuous function on $[a, b]$, then

$$\int_a^b g(s)\, df(s) \in D(U)$$

and

$$U\Bigl[\int_a^b g(s)\, df(s)\Bigr] = \int_a^b g(s)\, d[U[f](s)]. \tag{10.2.8}$$

Proof. In the first case consider a partition of the interval $[a, b]$, say

$$a = s_0 < s_1 < s_2 < \cdots < s_n = b,$$

a set of intermediary points $\{t_j\}$, $s_{j-1} \le t_j \le s_j$, and the corresponding
Riemann-Stieltjes sum

$$\sum_{j=1}^{n} f(t_j)\, [g(s_j) - g(s_{j-1})]. \tag{10.2.9}$$

By assumption each term in this finite sum belongs to $D(U)$, hence also the sum,
and by the linearity of $U$

$$U\Bigl\{\sum_{j=1}^{n} f(t_j)\, [g(s_j) - g(s_{j-1})]\Bigr\} = \sum_{j=1}^{n} U[f(t_j)]\, [g(s_j) - g(s_{j-1})].$$

As the partitions become finer and finer, the sequence of sums of type
(10.2.9) and the corresponding sequence with typical element

$$\sum_{j=1}^{n} U[f(t_j)]\, [g(s_j) - g(s_{j-1})]$$

converge to limits

$$\int_a^b f(s)\, dg(s) \quad \text{and} \quad \int_a^b U[f](s)\, dg(s), \tag{10.2.10}$$

respectively. Here we have used that $f(s)$ and $U[f](s)$ are strongly continuous
and $g \in BV[a, b]$. Because $U$ is closed, it follows that the first integral in
(10.2.10) belongs to $D(U)$ and (10.2.7) holds.

The same type of argument applies in the second case. From the equality
of the Riemann-Stieltjes sums

$$U\Bigl\{\sum_{j=1}^{n} g(t_j)\, [f(s_j) - f(s_{j-1})]\Bigr\} = \sum_{j=1}^{n} g(t_j)\, \{U[f(s_j)] - U[f(s_{j-1})]\}$$

and the closure of $U$ we get (10.2.8). ∎


We shall give an example of a large class of linear closed operators which
play an important role in analysis, particularly in the theory of linear differential
equations. The assumptions imposed on the operator may strike the reader as
artificial, but they are satisfied at least in the special case mentioned.

Theorem 10.2.4. Given $X = C[0, b]$ with the sup-norm. Let $U$ be a linear
operator from $X$ to itself. Suppose that a kernel function $K(s, t)$ is defined and
continuous in $[0, b] \times [0, b]$ such that any function $s \to z(s)$ satisfying

$$z(s) = y(s) + \int_0^s K(s, t) f(t)\, dt \tag{10.2.11}$$

with $0 \le s \le b$, $U[y](s) = 0$ and $f \in X$ is a solution of the functional
equation

$$U[z](s) = f(s). \tag{10.2.12}$$

Suppose further that any solution to (10.2.12) is of the form (10.2.11). Then
$U$ is a closed linear transformation, if $N[U]$, the nullspace of $U$, is closed.


Remark. The same result holds if the upper limit in the integral is fixed and
equal to $b$.


Proof. It is to be shown that $\{z_n\} \subset D(U)$, $z_n \to z_0$ together with $f_n = U[z_n] \to f_0$,
both in the sup-norm of $X$, implies that $z_0 \in D(U)$ and $U[z_0] = f_0$. By
assumption

$$z_n(s) = y_n(s) + \int_0^s K(s, t) f_n(t)\, dt \tag{10.2.13}$$

for some choice of $y_n$ in $N[U]$. The assumptions on $f_n$ imply that

$$\int_0^s K(s, t) f_n(t)\, dt \to \int_0^s K(s, t) f_0(t)\, dt, \tag{10.2.14}$$

to start with in the sense of point-wise convergence. Both sides are evidently
elements of $X = C[0, b]$ and from

$$\Bigl| \int_0^s K(s, t)\, [f_n(t) - f_0(t)]\, dt \Bigr| \le bB\, \|f_n - f_0\|,$$

where $B = \max |K(s, t)|$, $0 \le t \le s \le b$, we see that (10.2.14) holds in the sense
of the metric.

In (10.2.13) there are three terms. The first and the third converge to
limits in $X$ as $n \to \infty$. It follows that the sequence $\{y_n\}$ also converges to a limit
$y_0 \in X$. Now each $y_n$ belongs to the closed nullspace $N[U]$, so it follows that
$y_0 \in N[U]$ and

$$z_0(s) = y_0(s) + \int_0^s K(s, t) f_0(t)\, dt. \tag{10.2.15}$$

By assumption a function of this form satisfies

$$U[z_0](s) = f_0(s),$$

so that $U$ is closed. ∎

The assumptions are satisfied, in particular, for linear $n$th order differential
operators. If

$$L[z](s) = z^{(n)}(s) + a_1(s)\, z^{(n-1)}(s) + \cdots + a_n(s)\, z(s), \tag{10.2.16}$$

where the $a_k(s) \in C[0, b]$, then the solution of

$$L[z](s) = f(s) \tag{10.2.17}$$

is given by a formula of type (10.2.11), where $K(s, t) = G(s, t)$ is the Green's
function of the problem and $y(s)$ is a solution of the homogeneous equation

$$L[y](s) = 0. \tag{10.2.18}$$

In this setting formula (10.2.11) is obtained, for instance, by the method of
variation of the parameters.
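
The simplest instance of (10.2.11) is $L[z] = z''$, for which the kernel is $K(s, t) = s - t$ and the nullspace consists of the functions $c_1 + c_2 s$. The sketch below is an added numerical check under stated assumptions: the right-hand side $f(t) = \cos t$, the constants `c1`, `c2`, and the use of Simpson's rule and a central second difference are all illustrative choices, and the helper names are hypothetical.

```python
import math


def green_integral(f, s, steps=2000):
    """Simpson approximation of the integral of (s - t) f(t) over [0, s]."""
    if s == 0:
        return 0.0
    h = s / steps  # steps must be even for Simpson's rule
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * (s - t) * f(t)
    return total * h / 3


def z(s, c1=0.3, c2=-1.2):
    """Formula (10.2.11) for L[z] = z'': nullspace element plus Green integral."""
    return c1 + c2 * s + green_integral(math.cos, s)


def second_difference(g, s, h=1e-3):
    """Central difference approximation of g''(s)."""
    return (g(s + h) - 2 * g(s) + g(s - h)) / h ** 2


s0 = 0.8
# For f = cos the Green integral can be evaluated by hand: it equals 1 - cos(s).
closed_form = 0.3 - 1.2 * s0 + (1 - math.cos(s0))
residual = abs(second_difference(z, s0) - math.cos(s0))
```

The value `z(s0)` agrees with the closed form, and the small `residual` indicates that $z'' = f$ holds at the sample point, up to discretization error.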

The assumptions of Theorem 10.2.4 are as usual unnecessarily restrictive. 
We could replace continuous functions by square-integrable ones. It requires a 
little more work and the result may be desirable for various applications, but the 
present formulation permits a brief proof, exhibiting the essential ideas. 


EXERCISE 10.2 


1. Take $X = C[0, b]$, let $g(t)$ be a strictly positive fixed element of $X$ and let $U$ be the
linear operator on $X$ to itself defined by $U[f](t) = (d/dt)[f(t)\, g(t)]$. It is not
assumed that $g$ is differentiable. If $h(t)$ belongs to the range of $U$, express $f$ in terms
of $h$ and show that the range is all of $X$. Show that $U$ is closed.

2. Prove that the domain of U is dense in X. 

3. In formulas (10.2.16) to (10.2.18) take $n = 2$ and let $y_1$ and $y_2$ be two linearly
independent solutions of (10.2.18), i.e.

$$W(t; y_1, y_2) = \begin{vmatrix} y_1(t) & y_2(t) \\ y_1'(t) & y_2'(t) \end{vmatrix} \ne 0, \quad \forall t.$$

Form the function

$$z(t) = C_1 y_1(t) + C_2 y_2(t) + \int_0^t \frac{y_1(s)\, y_2(t) - y_2(s)\, y_1(t)}{W(s; y_1, y_2)}\, f(s)\, ds,$$

where $C_1$ and $C_2$ are arbitrary constants. Show that it is a solution of $L[z](t) = f(t)$
for any $f$ in $X$. Verify that the assumptions of Theorem 10.2.4 are satisfied so that $L$
is a closed operator on $X = C[0, b]$ to itself.

4. Consider the space $l_1$ with $\|\{x_n\}\| = \sum_{n=1}^{\infty} |x_n|$ and the linear unbounded operator $U$
which takes $\{x_n\}$ into $\{n x_n\}$. Is $U^k$, $k > 1$, closed?

5. Consider the space L,(0, 1) and the linear operator U which takes f(t) € L,(0, 1) 
into t~?f(t). Is U bounded? Is it closed? Is the domain of U dense? 


6. What happens if in the preceding problem the space L, (0, 1) is replaced by L, (0, οὐ), 
other data being unchanged?


10.3 LINEAR FUNCTIONALS


We shall base the discussion on the properties of Minkowski gauge functions 
first encountered in Section 1.6. Let X be a B-space to start with over the reals. 


Definition 10.3.1. A function $p(x)$ from $X$ to $R$ is called a gauge function for
the space if (1) $p(0) = 0$, (2) $p(\alpha x) = \alpha p(x)$, $\forall \alpha$, $\alpha > 0$, $\forall x$, (3) $p(x) \ge 0$, and (4)

$$p(x + y) \le p(x) + p(y). \tag{10.3.1}$$

Thus $p(x)$ is positive-homogeneous and subadditive. In a normed linear space
there are such functions, e.g. $\beta \|x\|$, where $\beta$ is any positive number. On the
other hand, if a linear Hausdorff space contains a convex set K containing the 
origin but not the whole space, then K has a gauge function and a normed topology 
may be introduced which is equivalent to the Hausdorff topology [Kolmogorov 
(1934)]. 


Lemma 10.3.1. A gauge function $p(x)$ is continuous for all $x$.

Proof. For $\alpha > 0$ and any $y \in X$, $y \ne 0$,

$$p(x) \le p(x - \alpha y) + p(\alpha y) \le p(x) + p(-\alpha y) + p(\alpha y) = p(x) + \alpha [p(y) + p(-y)].$$

Hence, by (1) and (2), $p(x)$ is continuous at the origin; furthermore,

$$p(x) \le \liminf_{\alpha \to 0} p(x - \alpha y) \le \limsup_{\alpha \to 0} p(x - \alpha y) \le p(x). \tag{10.3.2}$$

This holds for all $x$ and $y$ in $X$. Thus $p(x)$ is continuous everywhere. ∎


Theorem 10.3.1. The point set

$$K = \{x;\ x \in X,\ p(x) \le 1\} \tag{10.3.3}$$

is closed and convex.


Proof. If $\{x_n\} \subset K$, $x_n \to x_0$, then $p(x_n) \le 1$ implies $p(x_0) \le 1$ by the continuity
of $p$. Hence $K$ is closed. If $x_1 \in K$, $x_2 \in K$, and if $0 < \alpha < 1$, then by conditions
(2) and (4)

$$p(\alpha x_1 + (1 - \alpha) x_2) \le \alpha p(x_1) + (1 - \alpha) p(x_2) \le \alpha + (1 - \alpha) = 1,$$

so that $\alpha x_1 + (1 - \alpha) x_2 \in K$. Thus $K$ is convex. ∎
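
The convexity argument can be tested on a concrete gauge function. The following sketch is an added illustration: the asymmetric gauge $p(x) = \max(x_1, 0) + 2|x_2|$ on the plane is an assumed example (it satisfies conditions (1) to (4) of Definition 10.3.1 but is not a norm), and the sampling scheme is arbitrary.

```python
import random


def gauge(x):
    """An asymmetric gauge on the plane (illustrative choice, not a norm)."""
    return max(x[0], 0.0) + 2.0 * abs(x[1])


rng = random.Random(0)  # fixed seed so that the run is reproducible


def random_point_in_K():
    """Rejection sampling of a point of K = {x : p(x) <= 1} from a box."""
    while True:
        x = (rng.uniform(-3.0, 1.0), rng.uniform(-0.5, 0.5))
        if gauge(x) <= 1.0:
            return x


violations = 0
for _ in range(2000):
    x1, x2 = random_point_in_K(), random_point_in_K()
    a = rng.random()
    combo = (a * x1[0] + (1 - a) * x2[0], a * x1[1] + (1 - a) * x2[1])
    # The inequality used in the proof, with a small rounding tolerance.
    if gauge(combo) > a * gauge(x1) + (1 - a) * gauge(x2) + 1e-12:
        violations += 1
    # Membership of the convex combination in K.
    if gauge(combo) > 1.0 + 1e-12:
        violations += 1
```

No violations are expected, since both checks are consequences of conditions (2) and (4). Note that for this gauge the set $K$ is unbounded in the direction of negative $x_1$, which the theorem permits.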


Convex sets made their first appearance in Definition 1.2.3. 

A gauge function is a functional, i.e. a mapping from an abstract space
into $R$. It is a bounded functional. For boundedness we have given two slightly
different definitions, one in Section 5.3, the other in Section 7.1. Here we want
boundedness understood in the sense of (7.1.9), which is as if made for the present
situation.

In Section 2.3 we were faced with the problem of constructing linear bounded
functionals on a B-space. We shall now tackle this problem in earnest. For the
case of real functionals, the argument is based on the extension theorem of
Hahn-Banach, for complex functionals on that of Bohnenblust-Sobczyk.


Theorem 10.3.2. Let $X$ be a B-space over the reals, $X_0$ a linear subspace
non-dense in $X$. Let $p$ be a gauge function defined on $X$ with properties as
stated in Definition 10.3.1. Let $f_0$ be a linear bounded functional defined in $X_0$
where it satisfies

$$f_0(x) \le p(x), \quad \forall x \in X_0. \tag{10.3.4}$$

Then there exists a linear bounded functional $f$ such that (i) $D[f] = X$,
(ii) $f(x) \le p(x)$, $\forall x \in X$, and (iii) $f(x) = f_0(x)$, $\forall x \in X_0$.


Proof. The proof involves a series of successive extensions, partially ordered by
inclusion, and an appeal to Zorn's Lemma for the existence of a maximal element,
which is the desired functional.

First let $x_0 \in X \ominus X_0$ and form the space

$$X_1 = \{x;\ x = y + \lambda x_0,\ \lambda \in R,\ y \in X_0\}. \tag{10.3.4a}$$

This is a linear subspace of $X$ and it contains $X_0$ as a proper subset. Note also
that the representation of $x$ as the sum of an element of $X_0$ and a multiple of
$x_0$ is unique. We shall need some inequalities. If $x_1, x_2 \in X_0$, then

$$f_0(x_2) - f_0(x_1) = f_0(x_2 - x_1) \le p(x_2 - x_1)$$

and

$$p(x_2 - x_1) \le p(x_2 + x_0) + p(-x_1 - x_0).$$

Hence

$$-p(-x_1 - x_0) - f_0(x_1) \le p(x_2 + x_0) - f_0(x_2). \tag{10.3.5}$$

Let us put

$$r_1 = \sup_{x_1 \in X_0} [-p(-x_1 - x_0) - f_0(x_1)], \tag{10.3.6}$$

$$r_2 = \inf_{x_2 \in X_0} [p(x_2 + x_0) - f_0(x_2)]. \tag{10.3.7}$$

These are finite numbers and $r_1 \le r_2$. Now choose $r$ so that

$$r_1 \le r \le r_2,$$

then $r$ separates the two sides in (10.3.5) so that we have

$$-p(-x_1 - x_0) - f_0(x_1) \le r \le p(x_2 + x_0) - f_0(x_2) \tag{10.3.8}$$

for all $x_1, x_2$ in $X_0$.
We can now define an extension from $X_0$ to $X_1$ by setting

$$f_1(x) = f_0(y) + \lambda r, \qquad x = y + \lambda x_0. \tag{10.3.9}$$

We have clearly $f_1(x) = f_0(x)$ if $x = y$, $\lambda = 0$. Further, for $\lambda > 0$ and a $y_0$ in $X_0$ which
will be disposed over later we have

$$f_1(x) = f_0(y) + \lambda r \le f_0(y) + \lambda [p(y_0 + x_0) - f_0(y_0)] = f_0(y - \lambda y_0) + p(\lambda y_0 + \lambda x_0).$$

Here we set $\lambda y_0 = y$ and obtain

$$f_1(x) \le p(y + \lambda x_0) = p(x),$$

so that the extension is also bounded above by the gauge function. For $\lambda < 0$ the
left-hand inequality in (10.3.8) is used in the same manner.

Let $E$ be the class of all linear extensions of $f_0$ which are dominated above
by the gauge function on their domains. Partially order $E$ by setting $g \le h$ if $h$
is a linear extension of $g$, i.e. $D(g) \subset D(h)$ and $g(x) = h(x)$, $\forall x \in D(g)$. Let $C$
be a chain in $E$ and define a linear functional $F$ by specifying the domain of $F$ to
be the union of the domains of the members of $C$, and by assigning $F(x)$ to be
$g(x)$ for $x \in D(F) \cap D(g)$, where $g \in C$. Since $C$ is a chain, $F$ is well defined.
Moreover, it is clear that $F$ is a linear extension of $f_0$ and $F(x) \le p(x)$, $\forall x \in D(F)$,
so that $F \in E$ and it is an upper bound of $C$. By Zorn's Lemma, $E$ has a
maximal element $f$. Here $f$ must be an extension to $X$ since, if its domain were a
proper subspace $M$ and if $u \in X \ominus M$, then the method of constructing $f_1$
guarantees an extension to $M_1 = \{x;\ x = y + \lambda u,\ \lambda \in R,\ y \in M\}$. Clearly $f$
satisfies conditions (i) to (iii). ∎


So far we have dealt with a B-space over R and real functionals. The 
extension to the complex case was given by H. F. Bohnenblust and A. Sobczyk
in 1938. This requires an extension of the notion of a gauge function. 


Definition 10.3.2. A gauge function c(x) is said to be circular if it is
subadditive and positive-homogeneous and, in addition, c(0) = 0 and for every
complex number $\alpha$ we have


$c(\alpha x) = |\alpha|\, c(x).$ (10.3.10)


Theorem 10.3.3. Given a B-space X over the complex field and a circular
gauge function c(x). Given a complex-linear functional $F_0(x)$ defined on a
complex-linear subspace $X_0$ of X and such that $|F_0(x)| \le c(x)$, $\forall x \in X_0$. Then
there exists a complex-linear functional F(x) defined on all of X such that
$F(x) = F_0(x)$ in $X_0$ and $|F(x)| \le c(x)$, $\forall x$.


Proof. Set $\mathrm{Re}[F_0(x)] = f_0(x)$. This is a real-linear functional defined on $X_0$ and
$f_0(x) \le |F_0(x)| \le c(x)$ on $X_0$. Further,


$F_0(x) = f_0(x) - i f_0(ix).$ (10.3.11)


We can now use the preceding theorem to find a real-linear extension f(x) of
$f_0(x)$ to all of X such that (i) $f(x) = f_0(x)$ for x in $X_0$ and (ii) $|f(x)| \le c(x)$
for all x. It is now clear that $F(x) = f(x) - i f(ix)$ is a complex-linear extension
of $F_0(x)$ to all of X. Since $X_0$ is a complex-linear subspace of X we have
$iX_0 = X_0$ with obvious meaning. This implies that $f(ix) = f_0(ix)$ on $X_0$ so
that $F(x) = F_0(x)$ on $X_0$. Further, if $\theta = \arg[F(x)]$ we have
$|F(x)| = e^{-i\theta} F(x) = F(e^{-i\theta} x) = f(e^{-i\theta} x) \le c(e^{-i\theta} x) = c(x),$


so that the extension has all the desired properties. ∎


Theorem 10.3.4. If $x_0 \ne 0$ is a point in a B-space X, there exists a linear
bounded functional F defined on X such that $F(x_0) = \|x_0\|$ and $\|F\| = 1$.


Proof. We take
$X_0 = \{x;\ x = \alpha x_0,\ \alpha \in C\}, \quad F_0(x) = \alpha \|x_0\|, \quad x \in X_0.$


We can take $c(x) = \|x\|$ and note that $F_0(x_0) = \|x_0\|$. These data satisfy the
requirements of Theorem 10.3.3 and the conclusion follows. ∎


Theorem 10.3.5. Given a B-space X, a linear subspace $X_0$ non-dense in X, and
a point $x_1$ at a distance d > 0 from $X_0$. Then there exists a linear bounded
functional F on X such that (1) F(x) = 0, $\forall x$ in $X_0$ and (2) $\|F\| = 1/d$.


Proof. We set
$X_1 = \{x;\ x = x_0 + \alpha x_1,\ x_0 \in X_0,\ \alpha \in C\}, \quad F_0(x) = \alpha.$


The representation $x = x_0 + \alpha x_1$ is unique and $F_0(x)$ is a complex-linear functional
defined on $X_1$ and vanishing on $X_0$. We take $c(x) = \beta \|x\|$, where


$\beta = \sup \left\{ \frac{|\alpha|}{\|x_0 + \alpha x_1\|};\ x_0 \in X_0,\ \alpha \ne 0 \right\}.$ (10.3.12)


A simple calculation shows that $\beta = 1/d$. The extension theorem completes the
proof, giving us a complex-linear functional F of norm 1/d which vanishes on
$X_0$. ∎
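
The "simple calculation" giving the value of the supremum in (10.3.12) may be sketched as follows. This is our own filling-in of the step, using only the definition of the distance d from $x_1$ to the subspace $X_0$; the auxiliary symbol z is not in the text.

```latex
% For \alpha \neq 0 and x_0 \in X_0 put z = -\alpha^{-1} x_0. Since X_0 is a
% linear subspace, z ranges over all of X_0 as x_0 and \alpha vary.
\frac{|\alpha|}{\|x_0 + \alpha x_1\|}
  = \frac{1}{\|x_1 + \alpha^{-1} x_0\|}
  = \frac{1}{\|x_1 - z\|},
\qquad
\beta = \sup_{z \in X_0} \frac{1}{\|x_1 - z\|}
      = \frac{1}{\inf_{z \in X_0} \|x_1 - z\|}
      = \frac{1}{d}.
```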


Theorem 10.3.6. If $x_1 \ne x_2$ are two points of a B-space X, then there exists
a linear bounded functional F defined on all of X such that $F(x_1) \ne F(x_2)$.


Proof. Since $x_1 - x_2 \ne 0$, there exists a functional F defined on X such that
$F(x_1 - x_2) = \|x_1 - x_2\| \ne 0$. This F satisfies $F(x_1) \ne F(x_2)$. ∎


Actually we can prescribe the values of a functional at two or more points. 
We shall give the argument for two points. 


Theorem 10.3.7. Let $x_1, x_2$ be linearly independent elements of X and $c_1, c_2$
two distinct complex numbers. Then there exists a linear bounded functional
F on X such that

$F(x_1) = c_1, \quad F(x_2) = c_2.$


Proof. Set
$X_0 = \{x;\ x = \alpha_1 x_1 + \alpha_2 x_2,\ \alpha_1, \alpha_2 \in C\},$
$F_0(x) = \alpha_1 c_1 + \alpha_2 c_2, \quad x \in X_0.$
This is a linear functional defined on $X_0$ and
$F_0(x_1) = c_1, \quad F_0(x_2) = c_2.$


We have to prove that $F_0$ is a bounded functional. This leads to the following
question. Suppose
$\|\alpha_1 x_1 + \alpha_2 x_2\| = 1.$ (10.3.13)
Does this imply that
$|\alpha_1 c_1 + \alpha_2 c_2| \equiv G(\alpha_1, \alpha_2)$ (10.3.14)


is bounded? If this is the case and the supremum is M, then we can take
$c(x) = M\|x\|$ and apply the extension theorem.
Suppose that $G(\alpha_1, \alpha_2)$ is unbounded, i.e. $F_0(x)$ is unbounded for $\|x\| = 1$,
$x \in X_0$. This implies the existence of two infinite sequences $\{\alpha_{1n}\}$ and $\{\alpha_{2n}\}$ with
the following properties. If
$x_n = \alpha_{1n} x_1 + \alpha_{2n} x_2,$


then
$\|x_n\| = 1, \quad |F_0(x_n)| > 2n.$
Further,
$\limsup |\alpha_{1n}| = \limsup |\alpha_{2n}| = +\infty.$
For if only one of the sequences were unbounded, we would have $\limsup \|x_n\| = +\infty$ instead
of 1. We have now

$\|x_n\| = |\alpha_{1n}|\, \|x_1 + \beta_n x_2\|, \quad \beta_n = \frac{\alpha_{2n}}{\alpha_{1n}}.$


Here the first factor in the right member of the first relation is unbounded, so there
is at least a subsequence for which the second factor goes to zero or
$x_1 = -(\lim \beta_n) x_2,$

and this contradicts the assumed linear independence of $x_1$ and $x_2$. It follows
that

$\sup \{|F_0(x)|;\ x \in X_0,\ \|x\| = 1\}$
exists as a finite number M. We can then take $c(x) = M\|x\|$ and the extension
theorem does the rest. ∎


We state without proof 


Theorem 10.3.8. Let $x_1, x_2, \ldots, x_n$ be n linearly independent elements of a
B-space X and let $c_1, c_2, \ldots, c_n$ be n given constant numbers. Then there
exists a linear bounded functional F defined on X such that


$F(x_j) = c_j, \quad j = 1, 2, \ldots, n.$

More general existence theorems for linear bounded functionals have been
studied by several authors (H. Hahn, E. Helly, and F. Riesz among others).
Theorems 10.3.4 to 10.3.6 establish facts stated in Theorem 2.3.2.


Definition 10.3.3. A set S in a B-space X is said to be fundamental if the
closed linear extension of S is all of X.


In view of Theorem 10.3.5 we have 


Theorem 10.3.9. A necessary and sufficient condition that a set S in a B-space 
X be fundamental is that every linear bounded functional which vanishes on S 
vanishes identically, i.e. is the zero functional. 


EXERCISE 10.3 


1. Prove Theorem 10.3.7. 
2. Prove Theorem 10.3.8. 


3. Suppose that $F_1, F_2, \ldots, F_n$ are n linear bounded functionals on a B-space X of
dimension > n. Show that the system


$F_1(x) = 0, \quad F_2(x) = 0, \quad \ldots, \quad F_n(x) = 0$
always has non-trivial solutions.

Three questions concerning fundamental sets.

4. Show that $\{t^n;\ n = 0, 1, 2, \ldots\}$ is fundamental in C[a, b]. If t = 0 belongs to [a, b],
can the unit element 1 be omitted without affecting the result?


5. The set of functions holomorphic and bounded in the unit disk of the complex plane
forms a B-space under the sup-norm. Show that $\{z^n;\ n = 0, 1, 2, \ldots\}$ is a fundamental
set. Can any power of z be omitted?


6. Give a fundamental set for $L_2(-\pi, \pi)$.

The next two problems illustrate the question raised in (10.3.14).

7. Take $X = L_2(-\pi, \pi)$ and $x_1 = e^{it}$, $x_2 = e^{2it}$ and find $\sup |\alpha_1 c_1 + \alpha_2 c_2|$ for the
stated choice.

8. Same question for X = C[0, 1] with $x_1 = 1$, $x_2 = t$.

9. Back to $L_2(-\pi, \pi)$ and the orthonormal system

$\{(2\pi)^{-1/2} e^{int};\ n = 0, \pm 1, \pm 2, \ldots\}.$

Is it possible to find a linear bounded functional F which assumes preassigned values
$\omega_n$ for $f = f_n$, where $f_n$ runs through the orthonormal basis?


10.4 INVERSES AND ADJOINTS 


Consider now a linear transformation T from X into Y, two B-spaces over the
complex field. Let D = D(T) be the domain of T, R = R(T) the range. We
recall that T has an inverse, $T^{-1}$, iff the mapping of D onto R is 1-1. An
equivalent condition is that the nullspace of T reduces to the zero element


$N[T] = \{0\}.$ (10.4.1)


We also recall that if T has an inverse, then $T^{-1}$ is linear.

Theorem 10.4.1. If a closed linear transformation T has an inverse, then
$T^{-1}$ is also closed.


Proof. T is closed iff the set $\{(x, T(x));\ x \in D\}$ is closed in the product space
X × Y. The same set may be written as $\{(T^{-1}(y), y);\ y \in R\}$. Hence, if the
graph of T is closed, so is the graph of $T^{-1}$. ∎


A closed linear transformation which has an inverse may have a bounded
inverse. Theorem 2.2.2 asserts that $T^{-1} \in \mathfrak{E}(Y, X)$ iff R = Y and there is a
finite positive m such that

$\|T(x)\| \ge m\|x\|, \quad \forall x \in D.$ (10.4.2)


The largest permissible value of m is then the reciprocal of the norm of $T^{-1}$.


Theorem 10.4.2. If T and its inverse are bounded linear transformations
from D onto R and from R onto D, respectively, and if D is closed, so is R.


Proof. Suppose that $\{x_n\} \subset D$, $T(x_n) = y_n \in R$ and that $y_n \to y_0$. Then
$\|x_n - x_m\| \le \|T^{-1}\|\, \|y_n - y_m\|$, so that $\lim x_n = x_0$ exists. Since D is closed,
$x_0 \in D$ and $T(x_0)$ exists; since T is bounded, $T(x_0) = \lim T(x_n) = y_0$. Thus $y_0 \in R$ and R is
closed. ∎


We turn now to the adjoint transformation. Here we are dealing with four
B-spaces, X, Y, X*, Y*, all over the complex field. As the notation indicates,
X* is the adjoint space of X and Y* is the adjoint space of Y.


Definition 10.4.1. Let T be a linear transformation (not necessarily bounded)
from X to Y with domain D(T) dense in X. The adjoint transformation T*
has domain D(T*) made up of all elements y* of Y* for which there is an
$x^* \in X^*$ such that

$y^*[T(x)] = x^*(x), \quad x \in D(T).$ (10.4.3)


For any such y* we set
$T^*(y^*) = x^*.$ (10.4.4)


Some comments are in order. The density of D(T) implies that T*, defined
by (10.4.4), is a well-defined mapping. D(T*), the domain of T*, is not empty,
since (10.4.3) is at least satisfied by the trivial linear bounded functional which
maps every $y \in Y$ to zero. From (10.4.3) it follows that D(T*) is a subspace of
Y*. From (10.4.4) the linearity of T* follows. T* is actually a closed linear
transformation. This will be shown in Theorem 10.4.3.

We shall illustrate the definition by an example which is not completely trivial
and where the symbols can be unraveled.


Example 1. We take

$X = X^* = L_2(0, 1), \quad Y = C[0, 1], \quad Y^* = BV[0, 1]$
and define T by

$T[x](t) = \int_0^t x(s)\,ds.$


Here, if $g \in BV[0, 1]$ defines the functional y*, we have


$y^*[T(x)] = \int_0^1 \left[ \int_0^t x(s)\,ds \right] dg(t)$


$= \int_0^1 [g(1) - g(s)]\, x(s)\,ds.$


This is a functional x* defined on X as the inner product of x(s) and the conjugate
of g(1) − g(s). Thus T* takes the functional y* defined by g on Y into the
functional x* defined on X by the conjugate of g(1) − g(s).
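
The interchange of integrations in Example 1 can be checked numerically for particular data. The sketch below rests on assumptions of our own choosing: x(s) = s, and a smooth g(t) = t^2, so that the Stieltjes integral reduces to an ordinary integral against g'(t) dt. The integrals are approximated by the midpoint rule, and the helper names are hypothetical.

```python
# Numerical check of  y*[T(x)] = integral of [g(1) - g(s)] x(s) ds  for sample data.

def midpoint_integral(func, a, b, n=2000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))

def x(s):
    return s            # sample element of L2(0, 1) (our choice)

def g(t):
    return t * t        # sample smooth function of bounded variation (our choice)

def dg(t):
    return 2.0 * t      # g'(t), so that dg(t) = g'(t) dt

def Tx(t):
    """T[x](t) = integral of x(s) ds from 0 to t."""
    return midpoint_integral(x, 0.0, t, n=200)

# Left side: integral of T[x](t) dg(t) over [0, 1].
lhs = midpoint_integral(lambda t: Tx(t) * dg(t), 0.0, 1.0)
# Right side: integral of [g(1) - g(s)] x(s) ds over [0, 1].
rhs = midpoint_integral(lambda s: (g(1.0) - g(s)) * x(s), 0.0, 1.0)
print(lhs, rhs)
```

For these data both sides equal 1/4 in exact arithmetic, and the two approximations should agree to within the quadrature error.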

The operator T* has a number of properties, interesting in their own right
but also for the light they throw on T and $T^{-1}$.


Theorem 10.4.3. Let T be a linear transformation with domain D dense in
X and range R in Y. Then T* is a closed linear transformation on $D(T^*) \subset Y^*$
to X*. If, in addition, T is bounded, then $T^* \in \mathfrak{E}(Y^*, X^*)$ and $\|T^*\| = \|T\|$.


Proof. We refer to Theorem 10.1.3 and its Corollary. It shows that
$S(x) = y^*[T(x)]$ has a unique bounded extension to all of X provided T is
bounded on D which is dense in X. Thus T* is well defined and it is linear.
Let us prove that T* is closed. Consider a sequence $\{y_n^*\} \subset D(T^*)$ such that
$y_n^* \to y_0^*$ and $T^*(y_n^*) = x_n^* \to x_0^*$. Then $y_n^*[T(x)] = x_n^*(x)$, $\forall x \in D$,
implies that $y_0^*[T(x)] = x_0^*(x)$, $\forall x \in D$, so that $y_0^* \in D(T^*)$ and $T^*(y_0^*) = x_0^*$.
This is closure. If, in addition, T is bounded, then $S(x) = y^*[T(x)]$ defines a
linear bounded functional on D and we have


$|S(x)| \le \|y^*\|\, \|T\|\, \|x\|.$


Thus D(T*) = Y* and $\|T^*\| \le \|T\|$. To get the reverse inequality we observe
that, given $\varepsilon > 0$, we can find an $x_\varepsilon$ with $\|x_\varepsilon\| = 1$ such that $\|T(x_\varepsilon)\| > \|T\| - \varepsilon$.
If $y_\varepsilon = T(x_\varepsilon)$, we choose a functional $y_\varepsilon^* \in Y^*$ such that $y_\varepsilon^*(y_\varepsilon) = \|y_\varepsilon\|$, $\|y_\varepsilon^*\| = 1$;
Theorem 10.3.4 guarantees the existence of such a functional. We have now


$\|T^*\| \ge \|T^*(y_\varepsilon^*)\| \ge |[T^*(y_\varepsilon^*)](x_\varepsilon)| = |y_\varepsilon^*[T(x_\varepsilon)]| = \|y_\varepsilon\| > \|T\| - \varepsilon.$
Hence $\|T^*\| = \|T\|$. ∎

Theorem 10.4.4. If T is a linear transformation with domain D dense in X,
then $(T^*)^{-1}$ exists iff R is dense in Y. More generally, the closure of R is
made up of all points y such that $T^*(y^*) = 0$ implies $y^*(y) = 0$.


Proof. If $T^*(y_0^*) = 0$, then $[T^*(y_0^*)](x) = y_0^*[T(x)] = 0$, $\forall x$ in D, and
hence $y_0^*$ annihilates the closure of R. If, in particular, R is dense in Y, then
$y_0^*$ must be the zero functional and $N[T^*] = \{0\}$ so that T* has an inverse. On
the other hand, if R is not dense in Y and if $y_1 \in Y \ominus \overline{R}$, then by Theorem
10.3.5 we can find a linear bounded functional $y_0^* \in Y^*$ such that $y_0^*(y_1) = 1$
while $y_0^*$ annihilates R. For this functional $y_0^*[T(x)] = 0$, $\forall x \in D$. Hence
$y_0^* \in D(T^*)$ and $T^*(y_0^*) = 0$ while $y_0^*(y_1) \ne 0$. It follows that if R is not dense
in Y, then T* cannot have an inverse. ∎


Several other results are listed in the Exercise below. 


EXERCISE 10.4 


1. In Example 1 functionals x* € X* are defined. Are they dense in X*? 
2. Does T as defined in Example 1 have an inverse?


3. Take X= Y=1,, X* = Y* =1,, and let T be the shift operator T(x) = 
O05 Nas cc iss IE RS A ious he) Find ΤῊ 

4. Let T be a linear transformation with domain D dense in X. If $R(T^*)$ is dense
(weakly* dense suffices) in X*, then T has an inverse. Prove!


5. Let T be a linear transformation with an inverse and such that D is dense in X and R
dense in Y. Then $(T^*)^{-1} = (T^{-1})^*$; further $T^{-1}$ is bounded iff $(T^*)^{-1}$ is bounded
on X*. Prove!


6. Prove that a linear transformation T with domain dense in X has a linear bounded
inverse iff $R(T^*) = X^*$.


10.5 SPECTRA AND RESOLVENTS 


These concepts have been with us for a long time. For matrices they appeared in 
Sections 1.4 and 1.5, for operators in Section 2.6, and for elements of a Banach 
algebra they were the central notions of Chapter 9. In this section we shall 
consider the special case when the B-algebra is an operator algebra, $\mathfrak{E}(X)$ in our
previous notation. Here there are some new features.
Let $\mathfrak{B} = \mathfrak{E}(X)$, i.e. the B-algebra of linear bounded operators from X to X,
a B-space over the complex field. If $T \in \mathfrak{E}(X)$ and if $\lambda_0 I - T$ has a bounded inverse
with domain dense in X, we say that $\lambda_0 \in \rho(T)$, the resolvent set of T, and
write
$(\lambda_0 I - T)^{-1} = R(\lambda_0, T).$ (10.5.1)


The resolvent set is open and unbounded; it contains the point at infinity of the
$\lambda$-plane and need not be connected. $R(\lambda, T)$ is B-holomorphic in each of the
connected components of $\rho(T)$. The complement of $\rho(T)$ is the spectrum of T,
denoted by $\sigma(T)$. It is a closed bounded set and may separate the plane. It is
confined to the disk

$|\lambda| \le r(T),$ (10.5.2)


where r(T) is the spectral radius of T,


$r(T) = \lim_{n \to \infty} \|T^n\|^{1/n}.$ (10.5.3)


There is at least one point of $\sigma(T)$ on the rim of the disk.
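
Formula (10.5.3) is easy to watch at work in finite dimensions. The following is a small sketch for an operator on $C^2$ given by a 2 × 2 matrix; the matrix and the choice of the maximum-row-sum operator norm are ours, made only for illustration.

```python
# Spectral radius via r(T) = lim ||T^n||^(1/n) for a sample 2x2 matrix.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def op_norm_inf(a):
    """Maximum absolute row sum: the operator norm induced by the sup-norm."""
    return max(sum(abs(v) for v in row) for row in a)

# Upper triangular, so the spectrum is {0.5, 0.25} and r(T) = 0.5.
T = [[0.5, 1.0], [0.0, 0.25]]

power = [[1.0, 0.0], [0.0, 1.0]]
estimates = {}
for n in range(1, 201):
    power = matmul(power, T)          # power now holds T^n
    if n in (1, 10, 50, 200):
        estimates[n] = op_norm_inf(power) ** (1.0 / n)

print(estimates)
```

Here the norm of T itself is 1.5, well above the spectral radius, and the estimates decrease toward 0.5 only slowly; the limit in (10.5.3), not any single norm, locates the rim of the disk (10.5.2).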
The resolvent satisfies, for $\lambda \in \rho(T)$,


$(\lambda I - T) R(\lambda, T) = R(\lambda, T)(\lambda I - T) = I$ (10.5.4)


by virtue of (10.5.1). It also satisfies the two resolvent equations.
Theorem 9.2.1 of Nagumo now takes the form


Theorem 10.5.1. In a neighborhood of an isolated point $\lambda_0$ of $\sigma(T)$ there
is an expansion


$R(\lambda, T) = J \sum_{n=0}^{\infty} Q^n (\lambda - \lambda_0)^{-n-1} + \sum_{n=0}^{\infty} A^{n+1} (\lambda_0 - \lambda)^n.$ (10.5.5)

Here A, J, Q are elements of $\mathfrak{E}(X)$ uniquely determined by T. J is a
projection operator (i.e. idempotent), Q is quasinilpotent (or nilpotent) and
$AJ = JA = 0, \quad JQ = QJ = Q.$ (10.5.6)


The first series converges for all $\lambda \ne \lambda_0$, while the second converges in the
largest open disk with center at $\lambda_0$ containing no point of $\sigma(T)$ except
for $\lambda_0$.


The proof follows directly from Theorem 9.2.1.
We note that


$J = \frac{1}{2\pi i} \int_{\Gamma} R(\mu, T)\,d\mu,$ (10.5.7)

where $\Gamma$ is any simple closed rectifiable oriented curve ("scroc" for short in the
following) surrounding $\mu = \lambda_0$ once and leaving the rest of $\sigma(T)$ on the outside.
This is a special case of a more general spectral resolution. Suppose that it is
possible to split the spectrum into n disjoint spectral sets


$\sigma(T) = \sigma_1 \cup \sigma_2 \cup \cdots \cup \sigma_n,$ (10.5.8)


where $\sigma_j$ and $\sigma_k$ have a positive distance from each other. We may assume that
$\Gamma_k$ is a scroc surrounding $\sigma_k$ and separating it from the rest of the spectrum.

Theorem 10.5.2. Set

$J_k = \frac{1}{2\pi i} \int_{\Gamma_k} R(\mu, T)\,d\mu.$ (10.5.9)


Then each $J_k$ is an idempotent with $J_j J_k = \delta_{jk} J_k$, and the idempotents define
a resolution of the identity


$I = J_1 + J_2 + \cdots + J_n.$ (10.5.10)


Proof. This follows from Cauchy's theorem for B-holomorphic functions
together with the first resolvent equation


$R(\lambda, T) R(\mu, T) = \frac{R(\mu, T) - R(\lambda, T)}{\lambda - \mu}$ (10.5.11)


and the expansion
$R(\lambda, T) = \lambda^{-1} I + \lambda^{-2} T + \cdots + \lambda^{-n-1} T^n + \cdots$ (10.5.12)


valid for $|\lambda| > r(T)$. The right member of (10.5.10) is $(2\pi i)^{-1}$ times the integral of $R(\mu, T)$
taken over all the scrocs $\Gamma_k$. By Cauchy's theorem this equals the corresponding integral
taken over a circle $|\mu| = r > r(T)$. Here we can use (10.5.12) and see that the
integral reduces to I, so that (10.5.10) follows. Next we consider the product of
two integrals


$J_j J_k = \frac{1}{(2\pi i)^2} \int_{\Gamma_j} R(\lambda, T)\,d\lambda \int_{\Gamma_k} R(\mu, T)\,d\mu$


$= \frac{1}{(2\pi i)^2} \int_{\Gamma_j} \int_{\Gamma_k} R(\lambda, T) R(\mu, T)\,d\lambda\,d\mu.$


Here we use (10.5.11). Suppose first that $j \ne k$. We then obtain the difference of
two repeated integrals. One of these can be written


$\frac{1}{2\pi i} \int_{\Gamma_k} R(\mu, T) \left[ \frac{1}{2\pi i} \int_{\Gamma_j} \frac{d\lambda}{\lambda - \mu} \right] d\mu = 0$ (10.5.13)


since $\mu$ is outside of $\Gamma_j$. In the same manner, but interchanging the order of
integration, we see that the second integral is zero.

If now j = k we let $\Gamma_k'$ denote a contour surrounding $\Gamma_k$ and still leaving
the rest of the spectrum outside, and we integrate with respect to $\lambda$ along $\Gamma_k'$. Now $\mu$ is inside $\Gamma_k'$ in (10.5.13) so that the
numerical integral reduces to unity and the repeated integral equals $J_k$. The
second repeated integral still equals zero. ∎
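
In finite dimensions the idempotents (10.5.9) can be computed directly from the contour integrals. The sketch below is only an illustration under our own assumptions: T is a 2 × 2 matrix with spectrum {1, 3}, the scrocs are circles of radius 1/2 about the two spectral points, and the integrals are approximated by the trapezoidal rule. All function names are hypothetical.

```python
import cmath
import math

def resolvent(mu, T):
    """R(mu, T) = (mu*I - T)^(-1) for a 2x2 matrix, via the adjugate formula."""
    a, b = mu - T[0][0], -T[0][1]
    c, d = -T[1][0], mu - T[1][1]
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def riesz_projection(T, center, radius, n=400):
    """Approximate (1/(2*pi*i)) * integral of R(mu, T) d mu over the circle
    |mu - center| = radius by the trapezoidal rule with n nodes."""
    J = [[0j, 0j], [0j, 0j]]
    for k in range(n):
        w = radius * cmath.exp(2j * math.pi * k / n)   # w = mu - center
        R = resolvent(center + w, T)
        # d mu = i*w*d theta, hence (1/(2*pi*i)) d mu contributes w/n per node.
        for i in range(2):
            for j in range(2):
                J[i][j] += R[i][j] * w / n
    return J

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def max_diff(A, B):
    return max(abs(A[i][j] - B[i][j]) for i in range(2) for j in range(2))

T = [[1.0, 1.0], [0.0, 3.0]]                 # spectrum {1, 3} (our choice)
J1 = riesz_projection(T, center=1.0, radius=0.5)
J2 = riesz_projection(T, center=3.0, radius=0.5)

identity = [[1.0, 0.0], [0.0, 1.0]]
zero = [[0.0, 0.0], [0.0, 0.0]]
total = [[J1[i][j] + J2[i][j] for j in range(2)] for i in range(2)]

print(max_diff(matmul(J1, J1), J1))   # idempotent
print(max_diff(matmul(J1, J2), zero)) # J_1 J_2 = 0
print(max_diff(total, identity))      # resolution of the identity
```

For this matrix the residue of the resolvent at $\mu = 1$ can be found by hand and gives $J_1$ with rows (1, −1/2) and (0, 0), so the numerical output can be compared with an exact value.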


A point $\lambda_0 \in \sigma(T)$ iff $\lambda_0 I - T$ does not have a bounded inverse with
domain dense in X. There are various possibilities. We write $T_\lambda = \lambda I - T$.

Definition 10.5.1. The values of $\lambda$ for which $T_\lambda$ has an unbounded inverse
with domain dense in X form the continuous spectrum $C\sigma(T)$. The values of
$\lambda$ for which $T_\lambda$ has an inverse whose domain is not dense in X form the
residual spectrum $R\sigma(T)$. The values of $\lambda$ for which no inverse exists form
the point spectrum $P\sigma(T)$. The union of $C\sigma(T)$, $R\sigma(T)$ and $P\sigma(T)$ is the
spectrum $\sigma(T)$ of T.


Theorem 10.5.3. The four sets $\rho(T)$, $C\sigma(T)$, $R\sigma(T)$, $P\sigma(T)$ are mutually
disjoint and their union is the extended complex plane.


Verification is left to the reader. 

The nature of a particular spectral value is usually not evident and calls 
for a special investigation. The relations between analytical properties of the 
resolvent and metric or topological properties of the spectrum, on the one hand, 
and the classification of the spectral values, on the other, are quite baffling. The 
following is a simple connection, one of the few known to the author. 


Theorem 10.5.4. If $\lambda = \lambda_0$ is a pole of the resolvent, then $\lambda_0 \in P\sigma(T)$.


Proof. In formula (10.5.5) the operator Q is now a nilpotent. Suppose that
$Q^m$ is the zero operator but $Q^{m-1}$ is not. We can then find an element
$x_0 \in X$ with $\|x_0\| = 1$ such that $Q^{m-1} x_0 \ne 0$. We recall that


$Q = (T - \lambda_0 I) J = J (T - \lambda_0 I).$ (10.5.14)
Cf. formula (1.5.35). This gives
$[T - \lambda_0 I][Q^{m-1} x_0] = [T - \lambda_0 I][J Q^{m-1} x_0] = Q^m x_0 = 0.$
Thus $\lambda_0 \in P\sigma[T]$ and $Q^{m-1} x_0$ is a characteristic vector. ∎


Spectral values which are not poles of the resolvent may belong to any one 
of the three spectral classes. Examples illustrating the various possibilities are to 
be found in the Exercise below. 

In the case of an unbounded linear operator U, Definition 10.5.1 still makes 
sense. There are, however, new possibilities: (1) the spectrum may be unbounded; 
(2) the spectrum may coincide with the finite complex plane; (3) the spectrum may 
be vacuous. The last possibility calls for $R(\lambda, U)$ to be an entire function of $\lambda$.
It is customary to say that $\lambda = \infty$ belongs to the extended spectrum, and this
terminology is also used if $\lambda = \infty$ is an isolated singularity of $R(\lambda, U)$, even if this
is not an entire function. Various examples to illustrate these possibilities are 
to be found in Exercise 10.5. 

In discussing the nature of spectral points an alternate approach is often useful. 


Definition 10.5.2. The operator T has the property

$P_1$ if there exists an element $x_0$ of X such that $\|x_0\| = 1$ and $T x_0 = 0$;

$P_2$ if R(T) is non-dense in X;

$P_3$ if there exists a sequence $\{x_n\} \subset X$ such that $\|x_n\| = 1$, $\forall n$, and $T x_n \to 0$.

With this terminology we can reformulate the classification as follows.


Theorem 10.5.5. A point $\lambda$ belongs to

$\sigma(T)$ if $T_\lambda$ has at least one of the properties $P_j$;
$P\sigma(T)$ if $T_\lambda$ has $P_1$;

$R\sigma(T)$ if $T_\lambda$ has $P_2$ but not $P_1$;

$C\sigma(T)$ if $T_\lambda$ has $P_3$ but neither $P_1$ nor $P_2$; and
$\rho(T)$ if $T_\lambda$ has none of the properties $P_j$.


This is just a paraphrase of the previous classification, but it brings out what
may be called the "pecking order"


$P_1 > P_2 > P_3.$ (10.5.15)


$P_1$ does not exclude $P_2$ or $P_3$ but $P_1$ decides the classification and so on.
There are various relations between the spectra of T and of its adjoint T*,
the most important of which is


$\sigma(T) = \sigma(T^*), \quad \text{if } \overline{D} = X.$ (10.5.16)


See further the Exercise below. 


EXERCISE 10.5 


In the next three problems show that $\sigma(T) = \{0\}$ and $\lambda = 0$ is an essential singularity of
$R(\lambda, T)$ with the spectral classification as stated in the problem.
1. $X = l_2$, T takes $(x_1, x_2, \ldots, x_n, \ldots)$ into $(x_2, x_3/2, \ldots, x_{n+1}/n, \ldots)$. Point spectrum.
2. X = C[0, 1], T takes f(t) into $\int_0^t f(s)\,ds$. Residual spectrum.
3. $X = C_0[0, 1]$, the subspace of C[0, 1] with f(0) = 0, T as in the preceding problem.
Continuous spectrum.


The next problem exhibits a "solid" two-dimensional point spectrum.


4. $X = C[0, \infty]$, T takes f(t) into f(t + h), where h is a fixed positive number. Show
that $P\sigma(T) = \{\lambda;\ (|\lambda| < 1) \cup \{1\}\}$. Show that the rest of the unit circle belongs
to $C\sigma(T)$.


The next three problems exhibit various spectral possibilities of unbounded operators.
5. $X = L_2(-\pi, \pi)$, U takes f into f′. $P\sigma(U) = \{ni,\ n = 0, \pm 1, \pm 2, \ldots\}$.
6. Take the Kolmogorov matrix A = (a,,,) with αι. = —1, ay2 =1, α,,.--ἰΞ  --ΌἸΊ, 


Ann = — 1, 4,441 = 1forn > 1, all other a,,,, being equal to 0. Use A as an operator, 
y = #x, on a weighted sequence space to itself with 

2 xa 

|x| =r. 
1 (2n)! 


Here the system $Ax = \lambda x$ is satisfied by a vector $x(\lambda)$ where the nth component is a
polynomial $P_n(\lambda)$ of degree n − 1, $P_1(\lambda) = 1$. Show that $P_n(\lambda)$ has positive integral
coefficients, $P_n(0) = 1$ and $|P_n(\lambda)| \le n!\,|\lambda|^{n-1}$ for $|\lambda| > 1$. Hence show that the point
spectrum of the operator A covers the finite plane.


7. $X = C_0[0, 1]$, U takes f(t) into −f′(t). Prove that $R(\lambda, U)$ exists and


$R(\lambda, U)[f](t) = \int_0^t e^{\lambda(s - t)} f(s)\,ds,$


which is a $C_0$-valued entire function of $\lambda$.


8. Let $X = L_2(-\pi, \pi)$ and let T be a shift operator on the Fourier coefficients which
takes $\sum_{-\infty}^{\infty} f_n e^{int}$ into $\sum_{-\infty}^{\infty} f_{n+1} e^{int}$. If $F_n(\lambda)$ is the nth Fourier coefficient of $R(\lambda, T)[f]$
when it exists, show that


$F_n(\lambda) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{f(s)\, e^{-ins}}{\lambda - e^{-is}}\,ds.$


Discuss the existence of $F_n(\lambda)$ as a function of $\lambda$ and determine the spectrum of T.


. Show that {8 (6 ) εἰ, iff f()[e7  — e~*]-1 | L,(—7,72), w real. What bearing 


does this have on the nature of the spectrum? 


10. Verify Theorem 10.5.1.

11. Fill in missing details in the proof of Theorem 10.5.2.

12. Verify (10.5.14).

13. If J is the idempotent in (10.5.5), if $\lambda = \lambda_0$ belongs to $P\sigma(T)$ and $x_0$ is a characteristic
vector, prove that $J x_0 = x_0$. Prove also that $(\lambda - \lambda_0) R(\lambda, T) x_0 = x_0$.


14. Prove (10.5.16). Prove also that $R(\lambda, T^*) = R(\lambda, T)^*$, both under the assumption
that D is dense in X.
11 INNER PRODUCT SPACES


Many threads went into the fabric displayed in Chapters 1 and 2. Most of these 
threads have been examined in some detail in preceding chapters, but we still have 
to give a fuller account of inner-product spaces, introduced as a general concept in 
Section 2.5. The spaces $C^n$, $l_2$, and $L_2$ are important instances of such spaces with
which we are familiar. The linear bounded functionals on these special spaces are 
defined by inner products and are associated in a natural manner with the theory of 
quadratic and bilinear forms. These in their turn generalize in a natural manner to 
the relation between an operator and its adjoint 


$(Tx, y) = (x, T^* y).$


The case where T is self-adjoint, T* = T, requires that the spectrum of T be real 
and leads to the notion of the associated spectral function and the resolution of the 
identity defined by it. These are the main topics in the discussion. 

There are four sections: Review; Spectrum and the numerical range; 
Operational calculus; and The spectral theorem. 


11.1 REVIEW 


The following is a brief reminder of the basic concepts presented in Section 2.5. 

We consider a linear space X over the complex field and a mapping taking
ordered pairs $x, y \in X$ into an element (x, y) of C, called the inner product of x
with y. See Definition 2.5.1 for the properties of (x, y). We recall that (x, y) is
bilinear and skew-symmetric. We introduce a norm in X by setting


$\|x\| = [(x, x)]^{1/2}$ (11.1.1)
and define the metric by


$d(x, y) = \|x - y\|.$ (11.1.2)


If X is complete in this metric it is called a Hilbert space, for which we use the
generic notation H.

For fixed $y \in X$, the mapping $x \to (x, y)$ defines a linear bounded functional
l(x) on X. It was shown that the norm of the functional equals $\|y\|$. If X = H, all
linear bounded functionals are given by inner products, i.e. to $l \in H^*$ there is a
$y \in H$ such that l(x) = (x, y). There is a 1-1 correspondence between H and H*
which is an isometry and a skew symmetry. Thus H is its own adjoint space.




For any two vectors of X the Parallelogram Law holds:
$\|x_1 + x_2\|^2 + \|x_1 - x_2\|^2 = 2[\|x_1\|^2 + \|x_2\|^2].$ (11.1.3)
Two vectors $x_1$ and $x_2$ are said to be orthogonal or perpendicular, in symbols
$x_1 \perp x_2$, if
$(x_1, x_2) = 0.$ (11.1.4)


For such vectors the Pythagorean Law holds:
$\|x_1 + x_2\|^2 = \|x_1\|^2 + \|x_2\|^2.$ (11.1.5)
Let $M \subset H$ be a closed linear subspace of H. To M corresponds another
subspace denoted by $M^\perp$ and known as the orthogonal complement of M. Here
$(u, v) = 0$ if $u \in M$ and $v \in M^\perp$. (11.1.6)
Any $x \in H$, $x \ne 0$, admits a unique representation
$x = u + v, \quad u \in M, \quad v \in M^\perp, \quad H = M \oplus M^\perp.$ (11.1.7)
This defines a mapping P of H into M:
$P(x) = P(u + v) = P(u) = u, \quad P(v) = 0.$ (11.1.8)


P is a projection operator, i.e. an idempotent element of the operator algebra $\mathfrak{E}(H)$.
For $M \ne \{0\}$ we have


$P^2 = P, \quad \|P\| = 1, \quad N[P] = M^\perp.$ (11.1.9)


The operator I − P is also a projection. It projects H onto $M^\perp$. Its nullspace is M
and its norm is 1 unless $M^\perp = \{0\}$.

If H is infinite-dimensional, we can find a countable set of elements $\{v_n\}$, any
finite number of which are linearly independent, and by the Gram–Schmidt process
we can obtain an orthonormal system $\{u_n\}$. This leads to the formal Fourier series


$x \sim \sum_{n=1}^{\infty} a_n u_n, \quad a_n = (x, u_n).$ (11.1.10)


The partial sums of this series form a Cauchy sequence in H with the limit $\bar{x}$ which
may or may not coincide with x. Necessary and sufficient conditions for $\bar{x} = x$ are
found in Theorem 2.5.2.
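
As a finite-dimensional illustration of the Gram–Schmidt process and of the Fourier coefficients in (11.1.10), consider the following sketch in $C^3$. The starting vectors and the element x are arbitrary choices of ours; the inner product is taken linear in the first argument and conjugate-linear in the second.

```python
import math

def inner(x, y):
    """Inner product on C^n: linear in x, conjugate-linear in y."""
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors, one at a time."""
    basis = []
    for v in vectors:
        w = list(v)
        for u in basis:
            c = inner(w, u)                       # component of w along u
            w = [wi - c * ui for wi, ui in zip(w, u)]
        norm = math.sqrt(inner(w, w).real)
        basis.append([wi / norm for wi in w])
    return basis

v = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]   # linearly independent (our choice)
u = gram_schmidt(v)

x = [2, -1, 3]
coeffs = [inner(x, uk) for uk in u]      # Fourier coefficients (x, u_n)
# Partial sum of the Fourier series with all three terms.
partial = [sum(coeffs[k] * u[k][i] for k in range(3)) for i in range(3)]
print(partial)
```

In $C^3$ the orthonormal system obtained is complete, so the full sum reproduces x; in an infinite-dimensional space the limit of the partial sums need not coincide with x, as the text points out.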

We can generalize (11.1.7) in an obvious manner:


$H = \sum_j M_j, \quad M_j \cap M_k = \{0\}, \quad j \ne k,$ (11.1.11)
and
$x = \sum_j x_j, \quad x_j \in M_j, \quad (x_j, x_k) = 0, \quad j \ne k.$ (11.1.12)

Here the sums are finite or countably infinite. Such a decomposition may be based 
upon an orthonormal system and the linear manifolds spanned by the elements 
(singly or grouped) which form closed linear subspaces any two of which are 
orthogonal. 
To the decomposition (11.1.11) corresponds a set of projection operators $P_j$
such that


where $x_j$ is the vector occurring in (11.1.12). If the set of subspaces $M_j$ is finite,
say m in number, then we write


$I = P_1 + P_2 + \cdots + P_m$ (11.1.15)


and speak of this as a resolution of the identity. If the set is infinite, we can still
write
$I = \sum_j P_j$ (11.1.16)


and note that the partial sums of the operator series converge to I in the strong
operator topology. Each $P_j$ is an element of $\mathfrak{E}(H)$, the space of linear bounded
operators from H into itself.

If $T \in \mathfrak{E}(H)$, a basic tool in the study of T is given by the inner products
(Tx, x) and (Tx, y) which are the quadratic and polar forms, respectively, that
correspond to T. The adjoint operator T* is then defined by


$(Tx, y) = (x, T^* y)$ (11.1.17)


for all $x, y \in H$. Here $T \in \mathfrak{E}(H)$ implies and is implied by $T^* \in \mathfrak{E}(H)$. If, in
particular,
$T^* = T,$ (11.1.18)


then T is said to be self-adjoint or Hermitian and the corresponding forms (Tx, x)
and (Tx, y) are also called Hermitian.

In working with polar forms the analogue of (1.7.8) is a convenient tool. For
all $x, y \in H$ and all complex numbers s, t we have


$s\bar{t}\,(Tx, y) + \bar{s}t\,(Ty, x)$
$= (T(sx + ty), sx + ty) - |s|^2 (Tx, x) - |t|^2 (Ty, y).$ (11.1.19)


This identity implies that (Tx, x) = 0 for all x holds iff T is the zero operator in
analogy with Theorem 1.7.2. For if s = t = 1 we get


$(Tx, y) = -(Ty, x)$
and for s = 1, t = i we find that (Tx, y) = (Ty, x) so that
$(Tx, y) = 0, \quad \forall x, y.$


In particular, for y = Tx we get $(Tx, Tx) = \|Tx\|^2 = 0$ for all x and hence T is the
zero operator.
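
The two special cases used in this argument can be written out in full. The display below is merely an expansion of the polar identity by linearity in the first argument and conjugate-linearity in the second, under the hypothesis that the quadratic form of T vanishes identically.

```latex
% Expansion of the quadratic form of T at sx + ty:
(T(sx + ty),\, sx + ty)
  = |s|^2 (Tx, x) + s\bar{t}\,(Tx, y) + \bar{s}t\,(Ty, x) + |t|^2 (Ty, y).
% If (Tz, z) = 0 for every z, only the two mixed terms remain:
s = t = 1:\quad (Tx, y) + (Ty, x) = 0, \qquad
s = 1,\ t = i:\quad -i\,(Tx, y) + i\,(Ty, x) = 0.
% The second relation gives (Tx, y) = (Ty, x); combined with the first,
% 2\,(Tx, y) = 0 for all x and y.
```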
This fact implies first that T* is unique, secondly that


$(T^*)^* = T,$ (11.1.20)
so that the operation of taking the adjoint is an involution. Among other
consequences we list


Theorem 11.1.1. If $T \in \mathfrak{E}(H)$, then (Tx, x) is real for all x iff T is Hermitian.


Proof. (1) T Hermitian implies (Tx, x) real, Vx. This was proved in Theorem 
22 2.17. 


(2) (Tx, x) real for all x implies that T is Hermitian. For
$(Tx, x) = \overline{(Tx, x)} = (x, Tx) = (x, T^* x),$
so T* = T by the uniqueness of the adjoint. ∎


We recall that if S and T are Hermitian operators in $\mathfrak{E}(H)$ and if $\alpha$ is real, then
S + T and $\alpha T$ are Hermitian, while ST is Hermitian iff S and T commute. We
also recall that an arbitrary operator A in $\mathfrak{E}(H)$ admits of a unique representation
by Hermitian operators, namely


$A = B + iC, \quad B = \tfrac{1}{2}(A + A^*), \quad C = \tfrac{1}{2i}(A - A^*).$ (11.1.21)


Finally, we recall the definition of normal and of unitary operators. A is normal
iff A and A* commute, which is the case iff B and C commute in (11.1.21). U is
unitary if

$UU^* = U^*U = I$, or, equivalently, $U^{-1} = U^*$. (11.1.22)

EXERCISE 11.1 


1. The norm defined by (11.1.1) is differentiable in the sense that

$\lim \frac{1}{\alpha} [\|x + \alpha y\| - \|x\|]$


exists if $\alpha$ decreases to zero. Find the limit. What happens at x = 0? Cf. Section 8.4.


2. The norm is differentiable in this sense for each of the spaces $l_p$ and $L_p$, $1 < p < \infty$.
Verify this for $l_p$.


3. Verify (11.1.9) and statements about I − P.

4. Verify that the partial sums in (11.1.10) do form a Cauchy sequence.
5. Why is the representation (11.1.12) unique?

6. Why does $T \in \mathfrak{E}(H)$ imply $T^* \in \mathfrak{E}(H)$, and vice versa?

7. Show that T* is unique.

8. Show that $(T_1 T_2)^* = T_2^* T_1^*$.

9. Prove that S* = T* implies S = T.

10. Verify (11.1.20).
11. Prove that if T is Hermitian, so is $T^n$, n = 2, 3, ..., and so is any polynomial in T
with real coefficients.


12. If T has an inverse, show that so does T* and that $(T^{-1})^* = (T^*)^{-1}$.
13. If T is Hermitian, show that $\|T^2\| = \|T\|^2$.
14. Prove that $\|T^*\| = \|T\|$.
15. Let $T \in \mathfrak{E}(H)$ and let $\lambda = i$ belong to the resolvent set of T. Show that iI + T and
$(iI - T)^{-1}$ commute.
16. With T as in the preceding problem, form the Cayley transform
$C(T) = (iI + T)(iI - T)^{-1}$
and show that C(T) is unitary if T is Hermitian.


11.2 SPECTRUM AND THE NUMERICAL RANGE 


We return to the quadratic form (Tx, x) for any T in $\mathfrak{E}(H)$. If x is restricted to the
unit sphere, the values taken on by (Tx, x) constitute the numerical range of T,
denoted by W(T). Here the W stands for the German term "Wertvorrat". Thus


$W(T) = \{\alpha;\ (Tx, x) = \alpha,\ \|x\| = 1\}.$ (11.2.1)


This is a set of complex numbers which has interesting geometrical properties and 
is closely related to the spectrum of T. More precisely we shall prove 


Theorem 11.2.1. W(T) is a bounded convex set.
Theorem 11.2.2. The spectrum $\sigma(T)$ of T is a subset of the closure of W(T).


The first result is known as the Toeplitz—Hausdorff theorem. Otto Toeplitz 
(1881-1940) proved in 1918 that the boundary of W(T) is a convex curve but left 
open the possibility that there could be holes in the interior not belonging to 
W(T). This gap was filled by Felix Hausdorff in 1919. For the proof of this 
theorem we need the simple 

Lemma 11.2.1. If s and t are complex numbers, then

$W(sT + tI) = sW(T) + t,$ (11.2.2)

i.e. the numerical range of an affine transform of T is the same affine transform
of the numerical range of T.

Proof. Note that
$((sT + tI)x, x) = s(Tx, x) + t(x, x)$ (11.2.3)
which for $\|x\| = 1$ implies (11.2.2). ∎


Corollary 1. W(T) is convex iff W(sT + tI) has this property for all s and t.
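
Relation (11.2.3) can be observed pointwise on random unit vectors. The sketch below does this in $C^2$; the matrix T, the constants s and t, and the sampling scheme are illustrative assumptions of ours and not part of the text.

```python
import math
import random

def quad_form(T, x):
    """(Tx, x) on C^2, with the inner product linear in the first argument."""
    Tx = [T[0][0] * x[0] + T[0][1] * x[1], T[1][0] * x[0] + T[1][1] * x[1]]
    return Tx[0] * x[0].conjugate() + Tx[1] * x[1].conjugate()

def random_unit_vector(rng):
    """A unit vector of C^2 with Gaussian real and imaginary parts."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    n = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / n for c in v]

T = [[1 + 0j, 2 + 0j], [0j, 1j]]          # sample operator (our choice)
s, t = 2 - 1j, 0.5 + 3j                   # sample affine transformation
# The matrix of sT + tI.
S = [[s * T[i][j] + (t if i == j else 0) for j in range(2)] for i in range(2)]

rng = random.Random(0)
max_gap = 0.0
for _ in range(1000):
    x = random_unit_vector(rng)
    gap = abs(quad_form(S, x) - (s * quad_form(T, x) + t))
    max_gap = max(max_gap, gap)

print(max_gap)   # of the order of rounding error
```

Each sampled value of ((sT + tI)x, x) is thus the image of the corresponding value of (Tx, x) under the map z → sz + t, which is the content of (11.2.2).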
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Proof of Theorem 11.2.1. It is given that W(T) contains two complex numbers « 
and f and it is desired to prove that the line segment joining « and β is also in W(T). 
It is no restriction to assume that « = 0 and f = 1, for we can always use an affine 
transformation z —> sz + t which sends « into 0 and f into 1 and T into sT + 1]. 
Since straight line segments go into straight line segments under an affine trans- 
formation, the line segment joining « and f goes into the line segment joining 0 and 1. 
In this modified formulation of the question we have two unit vectors x and y 

such that | 
(Tx, x) = 0, (Ty, y) =1 (11.2.4) 


and it is required to prove that the line segment [0,1] belongs to W(T). In the 
canonical representation (11.1.24) for T, say 


T=B+iC, 
we have 
(Cx, x) = 0, (Cy, y) Ξ 0 (11.2.5) 


by virtue of (11.2.4). The unit vectors which satisfy (11.2.4) are not uniquely 
determined and could be replaced by w,x and q,y, respectively, where 
|w,| = |@,| =1 without affecting (11.2.4) and (11.2.5). We can use this added 
amount of freedom to require that the real part of (Cx, y) is 0, 


R(Cx, y) = 0. (11.2.6) 
With $x$ and $y$ chosen in this manner we set
$$z(s) = sx + (1 - s)y. \tag{11.2.7}$$


Since $\|z(s)\| \le s\|x\| + (1 - s)\|y\| = 1$, the vector $z(s)$ is confined to the unit ball.
It may possibly reduce to the zero vector for some value of $s$. This would mean
$sx = (s - 1)y$ and since $x$ and $y$ are unit vectors, this can only happen for $s = \tfrac{1}{2}$.
But if $x = -y$, then $(Tx, x) = (Ty, y)$, which contradicts (11.2.4). Thus $\|z(s)\| > 0$
for $0 \le s \le 1$. Using (11.1.19) with $s = s$, $t = 1 - s$ we get


$$(T[z(s)], z(s)) = s^2 (Tx, x) + (1 - s)^2 (Ty, y) + s(1 - s)\,[(Tx, y) + (Ty, x)]. \tag{11.2.8}$$
Here
$$(Tx, y) + (Ty, x) = 2\Re[(Bx, y)] + 2i\,\Re[(Cx, y)] = 2\Re[(Bx, y)] \tag{11.2.9}$$


by virtue of (11.2.6). It follows that (11.2.8) is real. Since $\|z(s)\| \le 1$ with equality
assured only at the endpoints, we replace $z(s)$ in (11.2.8) by the unit vector
$z(s)\,\|z(s)\|^{-1}$. The result is an element of $W(T)$ and a continuous function of $s$
which takes on the value 0 for $s = 1$ and 1 for $s = 0$. It follows that all
intermediary values are taken on at least once for $s$ in $[0, 1]$. Hence $W(T)$ is actually
convex. ∎
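As a numerical illustration of Theorems 11.2.1 and 11.2.2, the following Python sketch samples the numerical range of a matrix acting on $C^2$ with the Euclidean inner product. The helper names (`inner`, `apply_matrix`, `sample_numerical_range`) and the two sample matrices are choices made for this illustration and do not come from the text; random sampling only suggests the shape of $W(T)$ and proves nothing.

```python
import math
import random


def inner(u, v):
    """Inner product (u, v), linear in the first argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def apply_matrix(T, x):
    """Return the vector Tx for a square matrix T given as a list of rows."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]


def random_unit_vector(n, rng):
    """Draw a random vector on the unit sphere of C^n."""
    v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]


def sample_numerical_range(T, count, seed=0):
    """Return `count` sampled values (Tx, x) with ||x|| = 1."""
    rng = random.Random(seed)
    values = []
    for _ in range(count):
        x = random_unit_vector(len(T), rng)
        values.append(inner(apply_matrix(T, x), x))
    return values


# Hermitian example (illustrative choice) with characteristic values 1 and 3.
hermitian = [[2, 1], [1, 2]]
hermitian_samples = sample_numerical_range(hermitian, 2000)

# Nilpotent example (illustrative choice): the spectrum is {0}.
nilpotent = [[0, 1], [0, 0]]
nilpotent_samples = sample_numerical_range(nilpotent, 2000)
```

For the Hermitian matrix every sample is real up to rounding and lies in $[1, 3]$, the interval spanned by its characteristic values. For the nilpotent matrix $(Tx, x) = x_2 \bar{x}_1$, so every sample satisfies $|(Tx, x)| \le \tfrac{1}{2}$ although the spectrum reduces to the single point 0.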


Corollary 2. If $T$ is Hermitian, $W(T)$ is an interval on the real axis, more
precisely a subinterval of $[-\|T\|, \|T\|]$.

Proof. $W(T)$ is convex and the points of $W(T)$ are real by Theorem 11.1.1. A real
convex set is an interval. This interval must be a subinterval of $[-\|T\|, \|T\|]$ since
for $\|x\| = 1$
$$|(Tx, x)| \le \|T\|.$$
∎

As a preliminary to the proof of Theorem 11.2.2 we recall that the spectrum of
a linear bounded operator $T$ in $\mathfrak{E}(\mathfrak{H})$ breaks up into three disjoint parts:


(i) The point spectrum $P\sigma(T)$. The linear transformation $T_\lambda = \lambda I - T$ is not
1-1 for $\lambda = \lambda_0 \in P\sigma(T)$ and there is a unit vector $x_0$ such that $Tx_0 = \lambda_0 x_0$.


(ii) The continuous spectrum $C\sigma(T)$. The mapping $T_\lambda$ is 1-1 for $\lambda = \lambda_0 \in C\sigma(T)$
but the image of the unit sphere is not bounded away from the zero
element. There exists a sequence of unit vectors $\{x_n\}$ such that
$\lambda_0 x_n - Tx_n \to 0$.

(iii) The residual spectrum $R\sigma(T)$. The mapping $T_{\lambda_0}$ is 1-1 but the range of $T_{\lambda_0}$
is not dense in $\mathfrak{H}$.


These conventions recalled, we can now proceed to the 


Proof of Theorem 11.2.2. We take the several parts of the spectrum one at a time.
(1) $\lambda_0 \in P\sigma(T)$. Since there is an $x_0$ with $\|x_0\| = 1$ and $Tx_0 = \lambda_0 x_0$ we have
$$(Tx_0, x_0) = (\lambda_0 x_0, x_0) = \lambda_0 (x_0, x_0) = \lambda_0$$
and $P\sigma(T)$ belongs to $W(T)$.


(2) $\lambda_0 \in C\sigma(T)$. By the definition of the continuous spectrum there is a
sequence of unit vectors $\{x_n\}$ such that $\lambda_0 x_n - Tx_n$ goes to zero. Hence


$$(Tx_n, x_n) \to \lambda_0 (x_n, x_n) = \lambda_0$$
and $\lambda_0$ belongs to the closure of $W(T)$, if not to $W(T)$.


(3) $\lambda_0 \in R\sigma(T)$. The range of $T_{\lambda_0}$ is not dense in $\mathfrak{H}$. Since the range is a linear
subspace of $\mathfrak{H}$ there exists a linear functional which annihilates $R(T_{\lambda_0})$ without
being identically zero. Since it is a functional on $\mathfrak{H}$ we can find a $y \in \mathfrak{H}$ such that


$$(T_{\lambda_0} x, y) = 0, \quad \forall x \in \mathfrak{H}, \quad \|y\| = 1.$$
This gives
$$(Tx, y) = \lambda_0 (x, y), \quad \forall x,$$


and for $x = y$ we get $(Ty, y) = \lambda_0$, so $\lambda_0$ belongs to $W(T)$. ∎


Corollary 3. The convex hull of the spectrum is confined to the closure of the
numerical range
$$\operatorname{conv}[\sigma(T)] \subseteq \overline{W(T)}. \tag{11.2.10}$$


Here the left member may very well be a proper subset of the right. An
extreme case is furnished by quasi-nilpotent operators where $\sigma(T) = \{0\}$ while
$W(T) = \{0\}$ iff $T$ is the zero operator. The other extreme, equality in (11.2.10), is
furnished by normal operators. To prove this we need some preliminary results
which are of independent interest.
Lemma 11.2.2. If $A$ is normal, then
$$\|A^n\| = \|A\|^n. \tag{11.2.11}$$


Proof. We note first that for a normal operator $A$


$$(Ax, Ax) = (A^*Ax, x) = (AA^*x, x) = (A^*x, A^*x),$$
so that
$$\|Ax\| = \|A^*x\|, \quad \forall x. \tag{11.2.12}$$


Formula (11.2.11) is trivially true for $n = 1$. Suppose that it holds for $n \le k$.
Then


$$\|A^k x\|^2 = (A^k x, A^k x) = (A^{k-1}x, A^*A^k x) \le \|A^{k-1}x\|\,\|A^*A^k x\| = \|A^{k-1}x\|\,\|A^{k+1}x\|$$
by (11.2.12) with $x$ replaced by $A^k x$. Hence
$$\|A^k x\|^2 \le \|A^{k-1}\|\,\|A^{k+1}\|, \quad \forall x, \quad \|x\| \le 1,$$
and this inequality must hold also for the supremum of the left member, so that
$$\|A^k\|^2 \le \|A^{k-1}\|\,\|A^{k+1}\|.$$
By the induction hypothesis this gives


$$\|A\|^{2k} \le \|A\|^{k-1}\,\|A^{k+1}\|$$
or
$$\|A\|^{k+1} \le \|A^{k+1}\|.$$


Since the converse inequality is trivially true for all operators, (11.2.11) holds for
$n = k + 1$ and hence for all $n$. ∎
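The role of normality in (11.2.11) can be checked on $2 \times 2$ matrices. The sketch below is an illustration only: the function names and the two matrices are assumptions made here, and the operator norm is computed from the closed form for the largest characteristic value of the $2 \times 2$ matrix $A^*A$.

```python
import math


def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def adjoint(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]


def op_norm(A):
    """Euclidean operator norm: square root of the largest eigenvalue of A*A."""
    G = matmul(adjoint(A), A)  # Hermitian and non-negative
    trace = (G[0][0] + G[1][1]).real
    det = (G[0][0] * G[1][1] - G[0][1] * G[1][0]).real
    disc = max(trace * trace - 4.0 * det, 0.0)
    return math.sqrt((trace + math.sqrt(disc)) / 2.0)


# A normal matrix (illustrative choice): A A* = A* A = 2I.
normal = [[1, 1], [-1, 1]]
# A non-normal, nilpotent matrix (illustrative choice).
nilpotent = [[0, 1], [0, 0]]

normal_gap = abs(op_norm(matmul(normal, normal)) - op_norm(normal) ** 2)
nilpotent_norms = (op_norm(nilpotent), op_norm(matmul(nilpotent, nilpotent)))
```

For the normal matrix $\|A^2\|$ and $\|A\|^2$ agree up to rounding, whereas the nilpotent matrix has norm 1 while its square is the zero matrix, so (11.2.11) fails without normality.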


Corollary 1. The spectral radius of a normal operator equals its norm,


$$r(A) = \|A\|. \tag{11.2.13}$$
Proof. Use (11.2.11) and the definition of $r(A)$. ∎


In view of the properties of the spectral radius we have


Corollary 2. The spectrum of a normal operator $A$, which is confined in the
disk $|\lambda| \le \|A\|$, has at least one point on the circumference of the disk.


This implies


Corollary 3. For $\|x\| = 1$ and $A$ normal
$$\sup |(Ax, x)| = \|A\|. \tag{11.2.14}$$
Proof. We know that the supremum is at most equal to $\|A\|$, i.e. $W(A)$ is confined
to the disk. Now $\overline{W(A)}$ contains $\operatorname{conv}[\sigma(A)]$; a fortiori it contains $\sigma(A)$, at least one
point of which lies on the rim of the disk. This implies (11.2.14). ∎


In particular we get


Corollary 4. If $T$ is Hermitian, then at least one of the endpoints of the interval
$[-\|T\|, \|T\|]$ belongs to $\sigma(T)$.


There is an interesting consequence of Corollary 3 which is worth stating as


Lemma 11.2.3. If $A$ is normal and there exist $x_0$ and $\lambda_0$ such that $\|x_0\| = 1$,
$(Ax_0, x_0) = \lambda_0$ and $|\lambda_0| = \|A\|$, then $Ax_0 = \lambda_0 x_0$, so that $\lambda_0 \in P\sigma(A)$.


Proof. We have
$$\|A\| = |\lambda_0| = |(Ax_0, x_0)| \le \|Ax_0\|\,\|x_0\| \le \|A\|. \tag{11.2.15}$$


This requires that equality holds throughout the relation. Now by Theorem 2.5.1
equality requires that there is a constant $\gamma$ such that $Ax_0 = \gamma x_0$ and since
$(Ax_0, x_0) = \lambda_0$ we must have $\gamma = \lambda_0$, so that $\lambda_0 \in P\sigma(A)$ as asserted. ∎


We can sharpen Corollary 4 in a direction which is important for the
following.


Lemma 11.2.4. For $T$ Hermitian set
$$m = \inf\{\lambda;\ \lambda \in W(T)\}, \quad M = \sup\{\lambda;\ \lambda \in W(T)\}. \tag{11.2.16}$$


Then $\lambda = m$ and $\lambda = M$ belong to $\sigma(T)$ and $\sigma(T)$ is a subset of the interval
$[m, M]$.


Proof. That $\sigma(T) \subseteq [m, M]$ follows from (11.2.10). We have only to prove that
the endpoints of this interval belong to the spectrum. To prove this we use suitable
affine transformations on $T$ and the induced transformations on the numerical
range and spectrum. Formula (11.2.2) shows how the numerical range is affected
by an affine transformation. We have, similarly,


$$\sigma(sT + tI) = s\sigma(T) + t, \tag{11.2.17}$$
which holds for any $T \in \mathfrak{E}(\mathfrak{H})$. In the particular case where $T$ is normal, $sT + tI$
is also normal and its spectral radius equals its norm. We now take $s = 1$, $t = -m$,
so that
$$W(T - mI) = W(T) - m \subseteq [0, M - m].$$
Here $M \in \overline{W(T)}$, so
$$M - m = \sup\{\lambda;\ \lambda \in W(T - mI)\}.$$
In the other direction we have


$$0 \le \inf\{\lambda;\ \lambda \in W(T - mI)\}.$$
These two relations together with (11.2.14) give
$$\|T - mI\| = M - m,$$
and by Corollary 4 of Lemma 11.2.2 this implies


$$M - m \in \sigma(T - mI),$$
whence
$$M \in \sigma(T - mI) + m = \sigma(T).$$


Choosing $s = 1$, $t = -M$, we prove in the same manner that $m \in \sigma(T)$. ∎


Definition 11.2.1. A Hermitian operator $T$ is said to be positive if $0 \le m$,
positive definite if $0 < m$, negative if $M \le 0$, and negative definite if $M < 0$.
If $T_1$ and $T_2$ are two Hermitian operators, then $T_1 \le T_2$ shall mean that $T_2 - T_1$
is positive.


Lemma 11.2.5. If $A$ is normal and $A = B + iC$ where $B$ and $C$ are Hermitian,
then


$$\sigma(A) \subseteq \{\beta + i\gamma;\ \beta \in \sigma(B),\ \gamma \in \sigma(C)\}, \tag{11.2.18}$$
$$W(A) \subseteq W(B) + iW(C). \tag{11.2.19}$$


Proof. Since $B$ and $C$ commute, $I$, $B$, $C$ may be embedded in a commutative
B-algebra to which Gelfand's Representation Theorem 9.3.3 applies. If $\mu$ is
a linear, multiplicative, and bounded functional on this algebra, then


$$\mu(A) = \mu(B) + i\mu(C) \tag{11.2.20}$$


by the linearity of the functional. Here $\mu(A)$, $\mu(B)$, $\mu(C)$ are spectral values of
$A$, $B$, $C$, respectively. Moreover, the functional can be varied so that every spectral
value is represented. This, however, does not mean that there is identity between
the two sides of (11.2.18); inclusion is the best that can be asserted.

The inclusion (11.2.19) follows from the identity


$$(Ax, x) = (Bx, x) + i(Cx, x).$$
∎
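A small example, chosen here for illustration and not taken from the text, shows that the inclusion (11.2.18) can be strict. Take the space $C^2$ and a diagonal, hence normal, matrix $A$ with its Hermitian parts $B$ and $C$:

```latex
\begin{align*}
A &= \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix} = B + iC, \qquad
B = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \qquad
C = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \\
\sigma(A) &= \{1,\ i\}, \qquad \sigma(B) = \sigma(C) = \{0,\ 1\}, \\
\{\beta + i\gamma;\ \beta \in \sigma(B),\ \gamma \in \sigma(C)\} &= \{0,\ 1,\ i,\ 1 + i\}.
\end{align*}
```

The spectrum of $A$ consists of two of the four points on the right, so the left member of (11.2.18) is a proper subset of the right member in this case.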
We can now prove 


Theorem 11.2.3. For normal operators the closure of the numerical range
coincides with the convex hull of the spectrum.


Proof. The convex hull of $\sigma(A)$ is the intersection of all closed half-planes which
contain $\sigma(A)$. Let $h$ be such a half-plane. Since an affine transformation obviously
leaves inclusion properties invariant and also preserves normality of operators, we
may assume that $h$ is the closed right half-plane: $\mu \ge 0$ where $\lambda = \mu + i\nu$. If
$A = B + iC$ is the canonical representation of $A$, we see that $B$ is a positive operator.
For $B$ is Hermitian and if the spectrum of $A$ lies in the closed right half-plane, then
$\sigma(B)$ must lie on the positive real axis including the origin and this makes $B$ positive.
It follows that the numerical range $W(B)$ is real non-negative and this means that
$W(A)$ is located in the closed right half-plane. From this fact the general assertion
follows. ∎


EXERCISE 11.2 


1. A linear transformation $T$ from $C^2$ into itself is defined by the matrix which
has a one in the place (1, 2) and zero elsewhere. Use the Euclidean metric in $C^2$ and
find $\sigma(T)$ and $W(T)$.

2. Let $\{\lambda_n;\ n = 0, \pm 1, \pm 2, \ldots\}$ be given and consider an operator $T$ on $L_2(-\pi, \pi)$
defined by


$$T[f](t) = \sum_{n=-\infty}^{\infty} \lambda_n f_n e^{int},$$
where the $f_n$ are the Fourier coefficients of $f$.


How should the $\lambda$'s be chosen so that $T \in \mathfrak{E}(L_2)$? If this condition is satisfied, find
$\sigma(T)$ and $W(T)$.

3. Prove that the operator is normal if it is in $\mathfrak{E}(L_2)$ and find its representation in terms
of Hermitian operators.

4. Prove that an affine transformation $V$ really has the two properties stated in the text:
(1) If $S_1$ and $S_2$ are two sets in the complex plane and $S_1 \subseteq S_2$, then $V(S_1) \subseteq V(S_2)$.
(2) If $A$ is normal, so is $V(A)$.

5. Suppose that $A$ is normal and $A = B + iC$. The spectrum of $A$ is projected vertically
on the real axis and horizontally on the imaginary axis. Show that the first projection
coincides with $\sigma(B)$ and the second with $\sigma(C)$. Note that (11.2.18) is a weaker
statement.

6. Suppose that $T$ is Hermitian and that $\nu > 0$, $\lambda = \mu + i\nu$. Consider the resolvent
$R(\lambda, T)$ and prove that it is a normal operator. Find its spectrum and numerical
range. Show that both lie in the open lower half-plane. Here $\lambda$ is fixed.

7. Prove that a Hermitian operator $T \ne 0$ cannot be quasi-nilpotent.

8. Prove that (11.2.12) is sufficient for $A$ to be normal.


9. Prove that if $x_1$ and $x_2$ are characteristic vectors of a normal operator $A$
corresponding to distinct characteristic values, then $(x_1, x_2) = 0$.


10. Let $A$ be normal, $x_0$ a characteristic vector with the characteristic value $\lambda_0$. Prove
that $x_0$ is also a characteristic vector of $A^*$ corresponding to the characteristic value
$\bar{\lambda}_0$.

11. Prove that if $U$ is unitary, then $\sigma(U)$ is a subset of the unit circle $\{\lambda;\ |\lambda| = 1\}$.

12. Prove that the product of a finite number of unitary operators is unitary. 


11.3 OPERATIONAL CALCULUS 


The study of Hermitian operators, bounded as well as unbounded, started and had
an explosive development around 1930 with work by J. von Neumann (1903-57)
and M. H. Stone. Among the forerunners should be mentioned T. Carleman
(1892-1948), E. Hellinger (1883-1950), D. Hilbert, F. Riesz, and H. Weyl
(1885-1955). While the work of von Neumann and of Stone can be said to be
motivated by the requirements of quantum mechanics (or, equivalently, singular
boundary value problems raised by physics), the forerunners were concerned with
functions of infinitely many variables, in particular infinite quadratic forms,
singular boundary value problems, and singular integral equations.

Within the framework of this treatise there is no room for this vast theory in
satisfactory detail, but we shall give a brief account of some of the basic facts
essentially following the ideas developed by F. Riesz in 1930.

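Let us revert to the theory of Hermitian matrices given in Section 1.7. In
Exercise 1.7 the reader will find some of the problems which can serve as suggestive
analogues for what must be done for Hermitian operators on a Hilbert space. Let $\mathcal{H}$
be a Hermitian matrix in $\mathfrak{M}_n$, $\{\lambda_j\}$ its characteristic values,


$$\lambda_1 < \lambda_2 < \cdots < \lambda_n,$$
$\mathcal{E}_j$ the idempotent matrix (= projection matrix) corresponding to $\lambda_j$. We have then


$$\mathcal{H} = \lambda_1 \mathcal{E}_1 + \lambda_2 \mathcal{E}_2 + \cdots + \lambda_n \mathcal{E}_n, \tag{11.3.1}$$


$$R(\lambda, \mathcal{H}) = \frac{\mathcal{E}_1}{\lambda - \lambda_1} + \frac{\mathcal{E}_2}{\lambda - \lambda_2} + \cdots + \frac{\mathcal{E}_n}{\lambda - \lambda_n}. \tag{11.3.2}$$


More generally, if $t \to F(t)$ is a function holomorphic on the spectrum of $\mathcal{H}$, we
can define


$$F(\mathcal{H}) = F(\lambda_1)\mathcal{E}_1 + F(\lambda_2)\mathcal{E}_2 + \cdots + F(\lambda_n)\mathcal{E}_n. \tag{11.3.3}$$


Compare similar considerations for B-algebras in Section 9.4.
We can write these finite sums as Stieltjes integrals with a matrix-valued
step-function as integrator. Set


$$\mathcal{E}(t) = \begin{cases} 0, & -\infty < t \le \lambda_1, \\ \mathcal{E}_1 + \mathcal{E}_2 + \cdots + \mathcal{E}_k, & \lambda_k < t \le \lambda_{k+1}, \\ \mathcal{I}, & \lambda_n < t < \infty. \end{cases} \tag{11.3.4}$$


Here $k$ goes from 1 to $n - 1$ and $\mathcal{I}$ denotes the unit matrix of $\mathfrak{M}_n$. Then


$$F(\mathcal{H}) = \int_{-\infty}^{\infty} F(t)\, d\mathcal{E}(t) \tag{11.3.5}$$
or, equivalently, as an ordinary Stieltjes integral representing the Hermitian form
$$(F(\mathcal{H})x, x) = \int_{-\infty}^{\infty} F(t)\, d_t(\mathcal{E}(t)x, x). \tag{11.3.6}$$


In both formulas as well as in (11.3.3) we can allow $F$ to be merely continuous. The
reader will have discovered that $F(t) = t$ in (11.3.1) and $F(t) = (\lambda - t)^{-1}$ in (11.3.2).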
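The matrix formulas can be tried out on a $2 \times 2$ example. In the Python sketch below the matrix, its characteristic values 1 and 3, and the two projections were worked out by hand and are assumptions of this illustration; the function names `apply_function` and `spectral_step` are likewise hypothetical. The step function is taken left-continuous, that is, it sums the projections belonging to characteristic values strictly less than $t$.

```python
# Hermitian matrix with characteristic values 1 < 3 (supplied by hand).
H = [[2.0, 1.0], [1.0, 2.0]]
EIGENVALUES = [1.0, 3.0]
PROJECTIONS = [
    [[0.5, -0.5], [-0.5, 0.5]],  # projection on the span of (1, -1)
    [[0.5, 0.5], [0.5, 0.5]],    # projection on the span of (1, 1)
]
ZERO = [[0.0, 0.0], [0.0, 0.0]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def add(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]


def scale(c, A):
    """Scalar multiple of a 2x2 matrix."""
    return [[c * A[i][j] for j in range(2)] for i in range(2)]


def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply_function(F):
    """F(H) as the sum of F(lambda_j) E_j over the characteristic values."""
    result = ZERO
    for lam, E in zip(EIGENVALUES, PROJECTIONS):
        result = add(result, scale(F(lam), E))
    return result


def spectral_step(t):
    """Left-continuous step function: sum of the E_j with lambda_j < t."""
    result = ZERO
    for lam, E in zip(EIGENVALUES, PROJECTIONS):
        if lam < t:
            result = add(result, E)
    return result
```

With $F(t) = t$ the sum reproduces the matrix itself and with $F(t) = t^2$ it reproduces its square; `spectral_step` jumps from the zero matrix to the first projection just after $t = 1$ and to the unit matrix just after $t = 3$.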
These are the formulas to be generalized to Hermitian operators $T$ on an
arbitrary Hilbert space $\mathfrak{H}$. To each such $T$ corresponds a spectral function $E(t)$.
This is an operator from $\mathfrak{H}$ into itself for all real $t$ and belongs to $\mathfrak{E}(\mathfrak{H})$ if $T$ is
bounded. It defines a resolution of the identity (some authors say that it "is" a
resolution of the identity). The operator $E(t)$ is non-decreasing with respect to $t$
in the sense of Definition 11.2.1. It is a projection operator; $E(s)$ commutes with
$E(t)$ and with $T$ for all $s$ and $t$. Further, $E(s)E(t) = E(s)$ for $s < t$ and $E(t) - E(s)$
is also a projection. If $T$ is bounded and its spectrum is confined to the interval
$[a, b]$, then $E(t) = 0$ for $t \le a$ and $E(t) = I$ for $b < t$. The reader should verify that
the operator $\mathcal{E}(t)$ defined by (11.3.4) has all these properties.
We have now


$$(F(T)x, x) = \int_{-\infty}^{\infty} F(t)\, d_t(E(t)x, x). \tag{11.3.7}$$


If $T$ is bounded this holds for all $x \in \mathfrak{H}$ and all functions $F$ continuous on the
spectrum of $T$. If $T$ is unbounded, $x$ and $F$ are subject to limitations.

The problem is now to find the spectral function $E(t)$ and to study its
properties. The basic idea of F. Riesz was to obtain $F(T)$ directly by a limiting
process for positive continuous functions $F$ and bounded Hermitian operators $T$
and to reduce the problem of finding $E(t)$ to that of finding the operator $T^+$
corresponding to

$$t \to t^+ = \max(0, t). \tag{11.3.8}$$


In the matrix case $\mathcal{H}^+$ would be obtained by summing over the non-negative
characteristic values in (11.3.1). The mapping $t \to F(t)$ corresponds to the
mapping $T \to F(T)$ and the mapping $F(t) \to F(T)$ has the fundamental property of
preserving positivity. The importance of such mappings has come to our attention
repeatedly and this will not be the last time.

The mapping $F(t) \to F(T)$ is easily effected if $F$ is a polynomial: in the
expression for the polynomial we simply replace the scalar-valued variable $t$ by the
operator-valued $T$ while numerical constants are replaced by $I$ times the constant
in question. To a sequence of scalar polynomials $\{F_n(t)\}$ corresponds a sequence of
operator polynomials $\{F_n(T)\}$ and, as will be seen, monotonicity of the first
sequence implies the same property for the second one in the sense of the partial
ordering.

We shall need a number of results concerning positive functions and
positive operators. We start with a theorem due to F. Hausdorff (1921) concerning
polynomials of the form


$$P(t) = \sum_{k=0}^{m} c_k (t - a)^k (b - t)^{m-k}, \tag{11.3.9}$$


where $-\infty < a < b < \infty$. For the special case $a = 0$, $b = 1$ we encountered such
polynomials in the proof of Theorem 3.2.4 (Bernstein's Approximation Theorem).
Let $P(a, b)$ denote the set of all such polynomials for fixed $a$ and $b$. Actually this
class contains all polynomials and, moreover, any polynomial may be written in the
form (11.3.9) in infinitely many ways. This is due to the fact that for all positive
integers $n$ we have


$$1 = (b - a)^{-n} \sum_{k=0}^{n} \binom{n}{k} (t - a)^k (b - t)^{n-k}. \tag{11.3.10}$$

It should also be noted that $P(a, b)$ is an algebra. In particular, the product of two
polynomials in $P(a, b)$ is also in $P(a, b)$. It is a question here of formal invariance.
The product is obtained by termwise multiplication and collection of terms.
Hausdorff's theorem is concerned with the subclass $P^+(a, b)$ containing all
polynomials of the form (11.3.9) with non-negative coefficients $c_k$. This class is closed
under addition, multiplication by positive numbers, and element-wise multiplication.


Theorem 11.3.1. Each polynomial which is positive in $[a, b]$ equals some
element of $P^+(a, b)$. Conversely, each element of $P^+(a, b)$ is positive in $(a, b)$.


Proof. The converse part is obvious by inspection. To prove the direct part, we
note that a polynomial real on the real axis admits of a unique factorization into
linear and irreducible quadratic factors. If the polynomial is positive in $[a, b]$, then
each of the factors may be assumed to be positive there. A linear factor is then of
the form

$$L(t) = (b - a)^{-1}\,[C_1 (t - a) + C_2 (b - t)], \tag{11.3.11}$$


where $C_1 = L(b)$ and $C_2 = L(a)$, both positive numbers, so that $L \in P^+(a, b)$.
Quadratic factors require more elaborate manipulations. We write such a factor in
the form


$$Q(t) = A + 2B(t - a) + C(t - a)^2, \quad A > 0, \quad C > 0, \quad AC - B^2 > 0. \tag{11.3.12}$$


Here we multiply the three summands in $Q(t)$ by the right member of (11.3.10)
with $n = p$, $p - 1$, $p - 2$, respectively, and where $p$ is a positive integer to be
determined. Thus


$$\begin{aligned}
Q(t) ={} & A (b - a)^{-p} \sum_{k=0}^{p} \binom{p}{k} (t - a)^k (b - t)^{p-k} \\
& + 2B (b - a)^{-p+1} \sum_{k=0}^{p-1} \binom{p-1}{k} (t - a)^{k+1} (b - t)^{p-1-k} \\
& + C (b - a)^{-p+2} \sum_{k=0}^{p-2} \binom{p-2}{k} (t - a)^{k+2} (b - t)^{p-2-k} \\
={} & (b - a)^{-p} \sum_{m=0}^{p} \frac{(p - 2)!}{m!\,(p - m)!}\, C_m (t - a)^m (b - t)^{p-m},
\end{aligned}$$


where
$$C_m = A p (p - 1) + 2B (b - a)(p - 1) m + C (b - a)^2 m (m - 1). \tag{11.3.13}$$
This is a quadratic polynomial in $m$ which will be positive for all values of $m$
provided its discriminant, taken with the same sign convention as $AC - B^2$ in
(11.3.12), is positive. This will be the case for sufficiently large values of $p$, since
the term of highest order in $p$ is a positive multiple of $AC - B^2$.
Thus $Q(t)$ has representatives in $P^+(a, b)$. Since each factor is
representable, so is the product, i.e. each polynomial which is positive in $[a, b]$ is
equivalent to an element of $P^+(a, b)$. ∎
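The construction for a quadratic factor can be carried out numerically. The following Python sketch, offered only as an illustration, computes the coefficients $C_m$ of (11.3.13), searches for the smallest $p$ for which all of them are strictly positive, and evaluates the resulting representation. The function names and the sample quadratic $Q(t) = 1 - 3t + 3t^2$ on $[0, 1]$ are assumptions made here.

```python
from math import factorial


def quadratic(A, B, C, a, t):
    """Q(t) = A + 2B(t - a) + C(t - a)^2."""
    return A + 2 * B * (t - a) + C * (t - a) ** 2


def hausdorff_coefficients(A, B, C, a, b, p):
    """Coefficients C_m, m = 0..p, of formula (11.3.13)."""
    return [A * p * (p - 1)
            + 2 * B * (b - a) * (p - 1) * m
            + C * (b - a) ** 2 * m * (m - 1)
            for m in range(p + 1)]


def evaluate_representation(A, B, C, a, b, p, t):
    """Evaluate the sum over m that represents Q(t) for a given p >= 2."""
    coeffs = hausdorff_coefficients(A, B, C, a, b, p)
    total = 0.0
    for m, c in enumerate(coeffs):
        weight = factorial(p - 2) / (factorial(m) * factorial(p - m))
        total += weight * c * (t - a) ** m * (b - t) ** (p - m)
    return total / (b - a) ** p


def smallest_positive_p(A, B, C, a, b, p_max=200):
    """Smallest p >= 2 for which every C_m is strictly positive, else None."""
    for p in range(2, p_max + 1):
        if all(c > 0 for c in hausdorff_coefficients(A, B, C, a, b, p)):
            return p
    return None


# Sample quadratic (illustrative): A = 1, B = -1.5, C = 3 on [0, 1],
# so that AC - B^2 = 0.75 > 0 and Q(t) = 1 - 3t + 3t^2.
p_found = smallest_positive_p(1.0, -1.5, 3.0, 0.0, 1.0)
```

For this sample some coefficient is zero or negative when $p = 2, 3, 4$, and all coefficients are positive from $p = 5$ on, which illustrates why $p$ must be taken sufficiently large.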


We shall be concerned with sequences of polynomials $\{P_n(t)\}$ which converge
uniformly to a given continuous function $F$ for $t$ in some interval $[a, b]$. It is no
restriction to assume that the sequence is also monotone increasing (or decreasing).
For if this is not the case at the outset, we can construct a new sequence with the
desired properties as follows. We first select a rapidly convergent subsequence
$\{P_{n_m}\}$ such that

$$|P_{n_m}(t) - F(t)| < 3^{-m}, \quad \forall m, \quad a \le t \le b,$$
and set
$$Q_m(t) = P_{n_m}(t) - 2^{-m}.$$
Then


$$Q_{m+1}(t) - Q_m(t) = P_{n_{m+1}}(t) - P_{n_m}(t) + 2^{-m-1} > 2^{-m-1} - 3^{-m-1} - 3^{-m} > 0 \tag{11.3.14}$$


for $m > 2$. Thus the sequence $\{Q_m(t);\ m > 2\}$ is increasing for all $t$ in $[a, b]$ and
obviously converges uniformly to $F(t)$.
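The range of $m$ for which the increments are positive can be checked directly. The short Python sketch below assumes the tolerances $3^{-m}$ and the shifts $2^{-m}$ of the construction just described; the function name `gap_lower_bound` is hypothetical.

```python
def gap_lower_bound(m):
    """Lower bound 2^(-m-1) - 3^(-m-1) - 3^(-m) for Q_{m+1}(t) - Q_m(t)."""
    return 2.0 ** (-m - 1) - 3.0 ** (-m - 1) - 3.0 ** (-m)


# First index at which the lower bound becomes positive.
first_positive = next(m for m in range(1, 50) if gap_lower_bound(m) > 0)
```

Under these assumptions the bound is negative for $m = 1$ and $m = 2$ and positive from $m = 3$ on, so the monotone sequence starts at the third term.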

We shall take up some properties of positive operators. We start by observing
that the square of a Hermitian operator is positive since


$$(T^2 x, x) = (Tx, Tx) = \|Tx\|^2. \tag{11.3.15}$$


A less obvious fact is that a positive operator has a positive square root as well
as a positive $p$th root for any $p > 1$. The reader can construct a proof of this by
carrying out the steps indicated in Problems 12 and 13 in Exercise 11.3 below.

Formula (11.3.15) implies that if $T$ is Hermitian and $S$ is positive and
commutes with $T$, then $ST^2 = TST = T^2 S$ is positive for


$$(TSTx, x) = (STx, Tx) = (Sy, y) \ge 0, \quad y = Tx. \tag{11.3.16}$$


F. Riesz observed that a positive operator $S$ can be written as a sum of squares.
He proved


Theorem 11.3.2. If $S$ is positive and if, for $\|x\| = 1$, $\sup (Sx, x) \le 1$, then with
$$A_1 = S, \quad A_{n+1} = A_n - A_n^2, \quad n = 1, 2, \ldots, \tag{11.3.17}$$


it is found that for each $n$ the operators $A_n$ and $I - A_n$ are positive and


$$S = \sum_{n=1}^{\infty} A_n^2 \tag{11.3.18}$$


in the sense that


$$\sum_{n=1}^{\infty} (A_n^2 x, x) = (Sx, x), \quad \forall x. \tag{11.3.19}$$


Remark. The assumption that $\sup (Sx, x) \le 1$ for $\|x\| = 1$ simplifies the formulas
and implies no restriction of the generality. If it is not satisfied at the outset, we
simply divide $S$ by the appropriate supremum and then proceed as indicated.


Proof. It is clear that $A_1 = S$ and $I - A_1$ are positive operators. We then
proceed by induction. For $n \ge 1$


$$A_{n+1} = A_n^2 (I - A_n) + A_n (I - A_n)^2, \quad I - A_{n+1} = (I - A_n) + A_n^2,$$


i.e. by (11.3.16) the operators in the left members are sums of positive operators if
$A_n$ and $I - A_n$ are known to be positive. Since


$$S = A_1^2 + A_2^2 + \cdots + A_n^2 + A_{n+1}$$


is the sum of positive operators, we see that
$$\sum_{k=1}^{n} (A_k^2 x, x) \le (Sx, x) \tag{11.3.20}$$


for all $n$. Thus the series in the left member of (11.3.19) is convergent for all
$x \in \mathfrak{H}$. That the sum of the series is $(Sx, x)$ follows from the fact that $(A_n^2 x, x) =
\|A_n x\|^2$ goes to zero, so that the remainder $(A_{n+1}x, x) \le \|A_{n+1}x\|\,\|x\|$ also goes to zero. ∎
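The recursion of Theorem 11.3.2 is easy to run on a small matrix. The following Python sketch is an illustration under stated assumptions: the symmetric matrix `S`, whose characteristic values lie between 0 and 1, and the number of steps are arbitrary choices, and the helper names are hypothetical.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matadd(A, B):
    """Entrywise sum of two matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def matsub(A, B):
    """Entrywise difference of two matrices."""
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def riesz_squares(S, steps):
    """Return (sum of A_k^2 for k = 1..steps, remainder A_{steps+1})."""
    n = len(S)
    A = [row[:] for row in S]
    total = [[0.0] * n for _ in range(n)]
    for _ in range(steps):
        A_squared = matmul(A, A)
        total = matadd(total, A_squared)
        A = matsub(A, A_squared)  # A_{k+1} = A_k - A_k^2
    return total, A


# Illustrative positive matrix with characteristic values in (0, 1).
S = [[0.5, 0.2], [0.2, 0.3]]
partial_sum, remainder = riesz_squares(S, 2000)
```

The partial sum plus the remainder reproduces `S` up to rounding at every stage, while the remainder itself decays only like $1/n$, so the convergence of the series of squares is slow.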


From this result we get


Theorem 11.3.3. If $S$ and $T$ are positive operators which commute, then $ST$ is
also positive.


Proof. Again supposing $\sup (Sx, x) \le 1$ for $\|x\| = 1$ we have the representation
(11.3.18) and hence also


$$(STx, x) = \sum_{n=1}^{\infty} (A_n^2 Tx, x), \tag{11.3.21}$$


where every term of the infinite series is non-negative by (11.3.16), since each $A_n$
is Hermitian and, being a polynomial in $S$, commutes with $T$. ∎
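That commutativity cannot simply be dropped from Theorem 11.3.3 is seen from a $2 \times 2$ example. The matrices and the test vector in the Python sketch below are choices made for this illustration only.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def quadratic_form(A, x):
    """The value (Ax, x) for a real matrix A and a real vector x."""
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    return sum(Ax[i] * x[i] for i in range(2))


# Two positive matrices (illustrative) that do not commute.
S = [[1.0, 0.0], [0.0, 0.0]]   # a projection
T = [[1.0, 1.0], [1.0, 1.0]]   # twice a projection
ST = matmul(S, T)
TS = matmul(T, S)

x = [1.0, -2.0]
form_value = quadratic_form(ST, x)
```

Here `ST` and `TS` differ, `ST` is not even symmetric, and its quadratic form is negative at the chosen vector although both factors have non-negative forms there.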
Let us also remind the reader that if $T$ is Hermitian and
$$a = \inf\{t;\ t \in W(T)\} < \sup\{t;\ t \in W(T)\} = b, \tag{11.3.22}$$
then the Hermitian operators


$$T - aI \quad \text{and} \quad bI - T \tag{11.3.23}$$
are positive.
We can now take up the main mapping theorems.


Theorem 11.3.4. Suppose that the scalar polynomial $P(t)$ is positive in the
interval $[a, b]$ and that $T$ is a Hermitian operator satisfying (11.3.22). Then the
operator $P(T)$ is positive.


Proof. By assumption $P(t)$ has a representation by an element of $P^+(a, b)$.
Suppose that (11.3.9) is this representation where the coefficients $c_k \ge 0$. Then


$$P(T) = \sum_{k=0}^{m} c_k (T - aI)^k (bI - T)^{m-k}. \tag{11.3.24}$$
Here the factors $T - aI$ and $bI - T$ are positive operators which commute, so
by Theorem 11.3.3 every term in the representation is a positive operator.
Multiplication by positive numbers and adding preserves the positive character. ∎


Corollary. If $P$ and $Q$ are two scalar polynomials such that $P(t) \le Q(t)$
for $a \le t \le b$ and if $T$ is a Hermitian operator satisfying (11.3.22), then
$Q(T) - P(T)$ is a positive operator or, in other words, $P(T) \le Q(T)$ in the
partial ordering.


This gives 


Theorem 11.3.5. If $F$ is positive and continuous for $a \le t \le b$ and if $T$ is a
Hermitian operator satisfying (11.3.22), then there exists a unique operator $F(T)$
which is positive.


Proof. Since $F$ is continuous and positive we can find an increasing sequence of
positive polynomials $\{P_n(t)\}$ which converges uniformly to $F(t)$ in $[a, b]$. This
follows from Bernstein's approximation theorem together with the discussion leading
up to formula (11.3.14). If $\max F(t) = K$, the sequence $\{K - P_n(t)\}$ is also made
up of positive polynomials which decrease to $K - F(t)$. We also consider the two
sequences $\{P_n^2(t)\}$ and $\{K^2 - P_n^2(t)\}$. The first increases to the limit $F^2(t)$, the
second decreases to $K^2 - F^2(t)$. To these four sequences correspond four
sequences of positive operators,


$$\{P_n(T)\}, \quad \{KI - P_n(T)\}, \quad \{P_n^2(T)\}, \quad \text{and} \quad \{K^2 I - P_n^2(T)\}.$$


The first and the third are increasing, the second and the fourth decreasing in the
sense of the partial ordering. This implies corresponding monotonic properties for
the associated Hermitian forms. Since


$$0 \le (P_n^2(T)x, x) \le K^2 (x, x), \tag{11.3.25}$$

it is seen that the sequence
$$\{(P_n^2(T)x, x)\} \tag{11.3.26}$$
is bounded and non-decreasing for fixed $x$ with $\|x\| = 1$. Hence it has a limit $\le K^2$.
Further, for $m < n$
$$P_m^2(t) \le P_m(t) P_n(t) \le P_n^2(t), \quad \forall t, \quad a \le t \le b,$$
and this implies
$$(P_m^2(T)x, x) \le (P_m(T) P_n(T)x, x) \le (P_n^2(T)x, x),$$

and, as $m, n \to \infty$, all three members converge to the same limit. This in its turn
implies that
$$([P_n(T) - P_m(T)]^2 x, x) = \|P_n(T)x - P_m(T)x\|^2 \to 0.$$
Thus $\{P_n(T)x\}$ is a Cauchy sequence in $\mathfrak{H}$ for each $x \in \mathfrak{H}$. Moreover, $\|P_n(T)\|$ is
bounded by $K$ for all $n$. Hence $P_n(T)$ converges strongly to an element $F(T)$ of
$\mathfrak{E}(\mathfrak{H})$.

We have shown that starting with a Cauchy sequence $P_n(t)$ in $C[a, b]$ which
converges to $F(t)$ we obtain an operator $F(T)$. This operator, however, might
conceivably depend upon what particular Cauchy sequence we start out with. That
this is not the case is seen as follows. Suppose that $\{Q_n(t)\}$ is another monotone
increasing sequence of positive polynomials converging to $F(t)$ uniformly in
$[a, b]$. To this sequence corresponds an operator sequence $\{Q_n(T)\}$ and a limit, the
operator $G(T)$. It should be shown that $F(T) = G(T)$. To this end we need the
following simple observation. For each $m$ we can find an $n > m$ such that


$$P_m(t) - 2^{-m} < Q_n(t), \quad \forall t,$$
$$Q_m(t) - 2^{-m} < P_n(t), \quad \forall t. \tag{11.3.27}$$


Note that if one integer $n$ will do for the first inequality, then any larger $n$ will also
do. This implies for each $x$ with $\|x\| = 1$


$$([P_m(T) - 2^{-m} I]x, x) \le (Q_n(T)x, x)$$
or
$$(P_m(T)x, x) - 2^{-m} \le (Q_n(T)x, x),$$
so that for $m \to \infty$
$$(F(T)x, x) \le (G(T)x, x).$$


The opposite inequality follows in the same manner and from
$$(F(T)x, x) = (G(T)x, x)$$


it follows that $F(T)$ and $G(T)$ are identical operators. Thus the mapping
$F(t) \to F(T)$ is unique. ∎


So far $F(t)$ was restricted to be positive on the interval $[a, b]$ but the extension
to arbitrary real or complex-valued continuous functions follows the usual pattern.
Write

$$F(t) = F_1(t) + iF_2(t) - F_3(t) - iF_4(t), \tag{11.3.28}$$


where each $F_j(t)$ is non-negative. Then $F_j(T)$ is a uniquely defined element of
$\mathfrak{E}(\mathfrak{H})$ and we define


$$F(T) = F_1(T) + iF_2(T) - F_3(T) - iF_4(T). \tag{11.3.29}$$


Further extensions to discontinuous functions such as step functions are
feasible. A step function $S(t)$ can be approximated by an increasing sequence of
continuous functions each of which can be approximated from below by
polynomials with any desired degree of accuracy. We can thus select an increasing
sequence of polynomials $\{P_n(t)\}$ which converges boundedly to $S(t)$. The
corresponding sequence $\{P_n(T)\}$ converges to a bounded operator $S(T)$. In particular,
if $S(t)$ is the characteristic function of an interval $J$, i.e. $S(t)$ is 1 or 0 according as
$t \in J$ or not, then $S(T)$ is a projection operator since $S^2(t) = S(t)$. Here $S(T)$ is
the zero operator if $J \cap [a, b] = \emptyset$.
To fix the ideas, suppose that
$$a < 0 < b \tag{11.3.30}$$


and consider the characteristic functions of the intervals $[a, 0)$ and $[0, b]$. Denote
the corresponding operators by $E^-(T) = E^-$ and $E^+(T) = E^+$, respectively.
They are projection operators and add up to the unit operator


$$E^-(T) + E^+(T) = I. \tag{11.3.31}$$


They commute with each other and with $T$ as well as with all $F(T)$ constructed
above. We set
$$TE^-(T) = T^-, \quad TE^+(T) = T^+. \tag{11.3.32}$$
Then
$$T^- + T^+ = T \tag{11.3.33}$$


and the following properties hold.


Theorem 11.3.6. $T^-$ is a negative operator, $T^+$ a positive one. Any element
$x \in \mathfrak{H}$ such that $Tx = 0$ satisfies


$$E^- x = 0, \quad E^+ x = x. \tag{11.3.34}$$


Proof. To prove that $T^+$ is a positive operator we have to prove that
$(T^+ x, x) \ge 0$ for all $x$. We observe that $T^+$ is the image of the function $t \to t^+ =
\max(0, t)$. This function is nowhere negative. Thus for any given $\delta > 0$, an
increasing sequence of polynomials $\{P_n(t)\}$ converging to $t^+$ must satisfy
$P_n(t) > -\delta$ for $n > N_\delta$, and this implies $P_n(T) > -\delta I$ and


$$(P_n(T)x, x) > -\delta (x, x), \quad N_\delta < n.$$
Hence
$$(T^+ x, x) \ge -\delta (x, x)$$


and $(T^+ x, x) \ge 0$ since $\delta$ is arbitrary. $T^-$ is handled in the same manner. For the
proof of (11.3.34) we need a partial generalization of the spectral mapping theorem.


Lemma 11.3.1. If $A$ is Hermitian and if $Ax_0 = \lambda_0 x_0$ where $\|x_0\| = 1$, then for
any function $F$, continuous on $[a, b]$ or the limit of continuous functions, we have


$$F(A)x_0 = F(\lambda_0) x_0. \tag{11.3.35}$$


Proof. By the "classical" spectral mapping theorem, more precisely the fine
structure theorem, this holds for polynomials, and hence for any $F$ which is the
limit of polynomials on $[a, b]$. ∎


Continuation of the proof of Theorem 11.3.6. We now return to (11.3.34). If
$Tx_0 = 0$, then $\lambda_0 = 0$ in the preceding lemma. If $F^-$ and $F^+$ are the scalar functions
corresponding to $E^-(T)$ and $E^+(T)$, respectively, then
$$E^-(T)x_0 = F^-(0)x_0, \quad E^+(T)x_0 = F^+(0)x_0. \tag{11.3.36}$$


We now note that as long as we are working with bounded Hermitian operators,
that is bounded intervals $[a, b]$, then $F^-(0)$ and $F^+(0)$ are independent of the
choice of $A$ in the lemma. But if $A$ is the zero operator, then


$$E^-(0) = 0, \quad E^+(0) = I. \tag{11.3.37}$$
Substituting this in (11.3.36) we see that
$$F^-(0) = 0, \quad F^+(0) = 1, \tag{11.3.38}$$


and this gives (11.3.34). ∎
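For a finite matrix the splitting can be written down explicitly. In the Python sketch below the symmetric matrix with characteristic values $-1$ and $+1$ and its two projections are worked out by hand and serve only as an illustration; all names are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def add(A, B):
    """Entrywise sum of two 2x2 matrices."""
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]


def quadratic_form(A, x):
    """The value (Ax, x) for a real matrix A and a real vector x."""
    Ax = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
    return sum(Ax[i] * x[i] for i in range(2))


# Illustrative Hermitian matrix with characteristic values -1 and +1.
T = [[0.0, 1.0], [1.0, 0.0]]
E_MINUS = [[0.5, -0.5], [-0.5, 0.5]]  # projection for the value -1
E_PLUS = [[0.5, 0.5], [0.5, 0.5]]     # projection for the value +1

T_MINUS = matmul(T, E_MINUS)  # negative part
T_PLUS = matmul(T, E_PLUS)    # positive part

test_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, -2.0], [3.0, 1.0]]
minus_forms = [quadratic_form(T_MINUS, x) for x in test_vectors]
plus_forms = [quadratic_form(T_PLUS, x) for x in test_vectors]
```

The two projections add up to the unit matrix, the two parts add up to `T`, and on the test vectors the form of `T_MINUS` is never positive while that of `T_PLUS` is never negative, in agreement with (11.3.31), (11.3.33), and Theorem 11.3.6.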


F. Riesz refers to Theorem 11.3.6 and the preceding formulas as the
"splitting theorem" (= Zerlegungssatz). From it the spectral resolution is
obtained by applying the theorem to the family of operators $T - tI$, $-\infty < t < \infty$.
Set

$$E(t) = E^-(T - tI). \tag{11.3.39}$$


This defines a family of operators with properties most of which were given in the
preview earlier in this section.


Theorem 11.3.7. The operators $E(t)$ are projections for each fixed $t$, the zero
projection for $t \le a$, the identity operator for $b < t$. Further, $E(s)$ commutes
with $E(t)$, with $T$ and with any admissible $F(T)$. For $s < t$


$$E(s) \le E(t), \quad E(s)E(t) = E(s), \tag{11.3.40}$$
and $E(t) - E(s)$ is a projection. $E(t)$ is left-continuous,
$$\lim_{s \uparrow t} E(s) = E(t). \tag{11.3.41}$$
Finally,
$$(tI - T)x = 0 \quad \text{implies} \quad E(t)x = 0, \quad [I - E(t)]x = x. \tag{11.3.42}$$


Proof. That $E(t)$ is a projection follows from the fact that $E^-(A)$ is a projection
for any choice of $A$, in particular for $A = T - tI$. Similarly, (11.3.42) is just a
special case of (11.3.34). The commutativity properties are obtained by observing
that $E(s)$ commutes with $sI - T$ and hence with $tI - T$, which commutes with
$E(t)$. In fact, any two operators $F_1(T)$ and $F_2(T)$ commute since the polynomial
operators commute.

The second formula under (11.3.40) is equivalent to


$$[I - E(t)]E(s) = 0.$$
Denote the left member by $P$. It is obviously a projection. It should be noted that
$$E(s)P = P, \quad [I - E(t)]P = P.$$
The splitting theorem asserts that for any Hermitian operator $A$, the operators
$AE^-(A)$ and $AE^+(A)$ are negative and positive operators respectively. We apply
this observation to the two operators $T - sI$ and $T - tI$ with $s < t$. Further, for
any $x \in \mathfrak{H}$ we form $y = Px$. Then
$$((T - sI)E(s)y, y) \le 0, \quad ((T - tI)[I - E(t)]y, y) \ge 0.$$


By the special choice of $y$ these formulas simplify to


$$((T - sI)y, y) \le 0, \quad ((T - tI)y, y) \ge 0$$
or


$$t(y, y) \le (Ty, y) \le s(y, y).$$


Since $s < t$ this holds iff $(y, y) = \|Px\|^2 = 0$. Hence $Px = 0$ for all $x$ and $P$ is the
zero operator.
This gives
$$\begin{aligned}
[E(t) - E(s)]^2 &= E^2(t) - 2E(s)E(t) + E^2(s) \\
&= E(t) - 2E(s) + E(s) \\
&= E(t) - E(s),
\end{aligned}$$
so $E(t) - E(s)$ is indeed a projection. Since projections are positive operators,
$E(s) \le E(t)$ as asserted.
The left hand continuity also follows from the splitting theorem. Since E(s) 
is non-decreasing, 


lim E(s) = E(t — 0) (11.3.43) 
571: 


exists in the strong sense. Set 
A =A, = E(t)— E(s), Ap = E(t) — E(t — 0). 
It is clear that A, is Hermitian and a projection. It should be proved that Ay = 0. 
From 
E(t)A =A, E(s)A = 0, (11.3.44) 
it follows that 
E(t) Ao — Ao, E(t - 0) Ao = 0. (11.3.45) 


The splitting theorem gives
(E(t)(T - tI)Δx, Δx) ≤ 0, ([I - E(s)](T - sI)Δx, Δx) ≥ 0,
so that by (11.3.44)
((T - tI)Δx, Δx) ≤ 0, ((T - sI)Δx, Δx) ≥ 0.
Letting s ↑ t, we see that Δx → Δ_0 x and the resulting double inequality implies the
equality
(TΔ_0 x, Δ_0 x) = t(Δ_0 x, Δ_0 x).

Here we use the Hermitian character of Δ_0 and its being a projection commuting
with T to obtain
(TΔ_0 x, x) = t(Δ_0 x, x).




Since this holds for all x we must have
T(Δ_0 x) = t(Δ_0 x).

This asserts that y = Δ_0 x is a characteristic vector of the operator T corresponding
to the characteristic value λ = t. We now invoke Theorem 11.3.6 with T replaced
by T - tI to obtain

0 = E(t)y = E(t)Δ_0 x = Δ_0 x

by (11.3.45). Since this holds for all x we see that Δ_0 = 0. ∎
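
The properties listed in Theorem 11.3.7 can be checked by hand in the finite-dimensional case. The following is a minimal sketch, not part of the text: it assumes a real symmetric 2 by 2 matrix T = [[a, b], [b, d]] with b ≠ 0, and it takes E(t) to be the sum of the eigenprojections belonging to eigenvalues strictly less than t, which is the choice that makes E(t) left-continuous and agrees with (11.3.42). The names spectral_family_2x2, matmul and close are illustrative only.

```python
import math

def spectral_family_2x2(a, b, d):
    """Return the map t -> E(t) for T = [[a, b], [b, d]] (real symmetric, b != 0)."""
    if b == 0:
        raise ValueError("this sketch assumes b != 0")
    mean = 0.5 * (a + d)
    rad = math.hypot(0.5 * (a - d), b)
    pairs = []
    for lam in (mean - rad, mean + rad):      # the two eigenvalues of T
        vx, vy = b, lam - a                   # satisfies the first row of (T - lam I) v = 0
        n = math.hypot(vx, vy)
        pairs.append((lam, (vx / n, vy / n)))

    def E(t):
        # Sum of the eigenprojections v v^T over eigenvalues strictly below t.
        P = [[0.0, 0.0], [0.0, 0.0]]
        for lam, v in pairs:
            if lam < t:
                for i in range(2):
                    for j in range(2):
                        P[i][j] += v[i] * v[j]
        return P

    return E

def matmul(A, B):
    """Product of two 2 by 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two 2 by 2 matrices."""
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(2) for j in range(2))

E = spectral_family_2x2(2.0, 1.0, 2.0)        # eigenvalues 1 and 3
print(E(0.5), E(2.0), E(4.0))
```

For this T the family is the zero matrix for t ≤ 1, the projection on the eigenvector (1, -1) for 1 < t ≤ 3, and the identity for t > 3, so that E(s)E(t) = E(s) for s < t can be verified directly with matmul.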


So far T was supposed to be a bounded Hermitian operator. F. Riesz considered
also the case of unbounded operators. Suppose that T is a closed operator with
domain D(T) dense in H and R(T) ⊂ H. Suppose that

(Tx, y) = (x, Ty) (11.3.46)

for all x and y in D(T). We still refer to T as Hermitian.
The basic fact in the study of such operators is the observation that the
equation
(iI - T)y = x (11.3.47)

has a unique solution y for each x ∈ H. In other words, R(i, T) exists as an element
of 𝔈(H). This gives as a necessary and sufficient condition for z ∈ D(T) the existence
of an x ∈ H such that

z = R(i, T)x. (11.3.48)


F. Riesz could extend his splitting theorem to such operators T and he showed
the existence of a spectral function E(t) with essentially the same properties as in
the bounded case. Since the spectrum of T is now an unbounded real point set, the
operator E(t) is not constant outside of a finite interval [a, b]. Instead we have the
limit relations

lim_{t→-∞} E(t)x = 0, lim_{t→+∞} E(t)x = x (11.3.49)
for all x. Thus E(t) converges strongly to 0 and I as t → -∞ and +∞,
respectively.

The other properties of E(t) are preserved. Thus the spectral function satisfies
(i) E(t) is a projection operator for each t, (ii) E(s) commutes with E(t), (iii) for
s < t, E(s) ≤ E(t), (iv) E(s) E(t) = E(s) for s < t, (v) E(t) - E(s) is a projection,
and lim_{s↑t} E(s) = E(t). We shall refer to these results as Theorem 11.3.8. It is given
without a proof.


EXERCISE 11.3 


1. Verify (11.3.13) and that the discriminant of the polynomial in m is positive for 
sufficiently large values of p. 


2. Fill in missing details in the proof of Theorem 11.3.2. 




3. If sup (Sx, x) > 1, how should the formulas and the proof be modified in Theorem 
11.3.2? 


4. Verify (11.3.31) and the commuting properties claimed for these operators. 


5. Verify (11.3.37) and show that F~ (0) and F*(0) are indeed independent of the 
choice of A under the stated conditions. 


6. Fill in missing details in the proof of Theorem 11.3.7. 


7. Show that the equation (iI - T)y = 0, T Hermitian, has y = 0 as its only solution,
so that the solution of (11.3.47) is unique if it exists.


8. In equation (11.3.47) the domain of iI - T is D(T). Show that the range R is dense
in H. [Hint: If not, there would exist z ∈ H, ||z|| = 1 such that z is orthogonal to
all of R. This leads to a contradiction since (Ty, y) is real. Details?]


9. If {x_n} ⊂ R is a Cauchy sequence with x_n → x_0 and (iI - T)y_n = x_n, show that
{y_n} and {Ty_n} are also Cauchy sequences, so that by the fact that T is a closed
operator x_0 ∈ R and R = H.


10. Show that (11.3.47) has a unique solution y for all x and verify that y = R(i, T)x
and that R(i, T) ∈ 𝔈(H).


11. Prove (11.3.49). 


12. Suppose that T is a positive operator, T ∈ 𝔈(H), and that sup (Tx, x) = M for
||x|| = 1. Define

S_2 = Σ_{n=0}^{∞} (-1)^n \binom{1/2}{n} M^{1/2 - n} (MI - T)^n.

Show that (1) the series converges in the uniform operator topology, (2) S_2 is
positive, (3) sup (S_2 x, x) = M^{1/2} for ||x|| = 1, and (4) S_2^2 = T so that S_2 is a positive
square root of T.


13. In the preceding problem replace 2 by p, 1/2 by 1/p, where p is a positive integer.
Carry through a similar argument with a view of showing that S_p is a positive pth
root of T.


14. Let {λ_n} be any sequence of real numbers and suppose that

Σ_{n=1}^{∞} a_n e^{iλ_n t} = f(t),

where the series is uniformly convergent in (-∞, ∞). Consider the space of all
such functions and adjoin to it the limits of uniformly convergent sequences {f_n(t)}.
Show that the so completed space is a Hilbert space under the inner product

(f, g) = lim_{ω→∞} (2ω)^{-1} ∫_{-ω}^{ω} f(t) \overline{g(t)} dt.

If λ is any real number show that exp (λit) ∈ H and that {exp (λit); -∞ < λ < +∞}
is a non-denumerable orthonormal system. Is H separable? This is the space of
almost periodic functions in the sense of Harald Bohr (1887-1951).
11.4 THE SPECTRAL THEOREM


We have now the basic properties of the spectral function E(t) corresponding to a 
Hermitian operator T which is closed and has its domain D(T) dense in H. We
have to define and discuss the Stieltjes integrals which figured in formula (11.3.7) 
and which form the core of the Spectral Theorem for Hermitian operators. 

Our point of departure is the operator 


Δ = Δ_{st} = E(t) - E(s), s < t (11.4.1)

introduced in the proof of Theorem 11.3.7. By Theorem 11.3.8 its properties are
essentially unaffected by the passage from a bounded to an unbounded Hermitian
operator. We note first that (T - tI)Δ is a negative bounded Hermitian operator
while (T - sI)Δ is positive and bounded. It follows that

ΔT = TΔ
is also bounded and Hermitian. Further,
s(Δx, x) ≤ (TΔx, x) ≤ t(Δx, x).
If now s ≤ λ ≤ t, we have
|((T - λI)Δx, x)| ≤ (t - s)(Δx, x) = (t - s)||Δx||^2,

so that
||λΔx - TΔx|| ≤ (t - s)||Δx||. (11.4.2)

Next we recall that the operators Δ_{st} are projections, so that
(Δ_{st})^2 = Δ_{st}. (11.4.3)


Moreover, if (s_1, s_2) and (t_1, t_2) are non-overlapping intervals having at most an
endpoint in common, say
s_1 < s_2 ≤ t_1 < t_2,
and if Δ_1 and Δ_2 are the corresponding projections, then
Δ_1 Δ_2 = [E(s_2) - E(s_1)] [E(t_2) - E(t_1)]
= E(s_2) E(t_2) - E(s_2) E(t_1) - E(s_1) E(t_2) + E(s_1) E(t_1)
= E(s_2) - E(s_2) - E(s_1) + E(s_1) = 0,
so that the projections are orthogonal.

We now consider a partition of the real axis into a countable number of
intervals (s_k, t_k) with no clusterpoint in the finite domain. Here t_k = s_{k+1} for all k.
Set
Δ_k = E(t_k) - E(s_k). (11.4.4)

The projections {Δ_k} form a complete orthonormal system in the sense that

Δ_k^2 = Δ_k, Δ_j Δ_k = 0, j ≠ k, (11.4.5)
and
Σ Δ_k = I, (11.4.6)
where the series on the left converges in the strong operator topology. Then if
x ∈ H we set
x_k = Δ_k x, Σ x_k = x (11.4.7)
and note that
(x_j, x_k) = 0 for j ≠ k, Σ ||x_k||^2 = ||x||^2 (11.4.8)
since
Σ ||x_k||^2 = Σ (Δ_k x, Δ_k x) = Σ (Δ_k x, x) = (x, x).


If now x ∈ D(T) we set Tx = y and note corresponding representations for y.
If y_k = Δ_k y, then
y_k = Δ_k y = Δ_k Tx = TΔ_k x = Tx_k,
and
Tx = Σ Tx_k, Σ ||Tx_k||^2 = ||Tx||^2. (11.4.9)


The convergence of the last series is necessary and sufficient for x ∈ D(T). Suppose
this condition is satisfied and let us show that x ∈ D(T). For j ≠ k

(TΔ_j)(TΔ_k) = T(Δ_j TΔ_k) = T^2(Δ_j Δ_k) = 0.

Thus the elements Tx_k = TΔ_k x of H are orthogonal and

|| Σ_{n+1≤|k|≤p} Tx_k ||^2 = Σ_{n+1≤|k|≤p} ||Tx_k||^2,

where n < p. By the assumed convergence of the second series in (11.4.9) the last
display tends to zero as n → ∞. It follows that the first series in (11.4.9) converges
to a limit, z say. If now u is any element of D(T) we have

(Tu, x_1 + x_2 + ... + x_n) = (u, Tx_1 + Tx_2 + ... + Tx_n).

As n → ∞ the left member converges to (Tu, x), the right to (u, z), so that
(Tu, x) = (u, z) and z = Tx since T is Hermitian. Hence x ∈ D(T).

Suppose that the partition of the real axis is so fine that all intervals are at most
δ in length where δ is a given positive number. Let s_k ≤ λ_k ≤ t_k for all k and let
x ∈ D(T). Then with the notation as above

||Tx_k - λ_k x_k|| = ||TΔ_k x - λ_k Δ_k x|| ≤ (t_k - s_k)||x_k|| ≤ δ||x_k||,
so that
Σ ||Tx_k - λ_k x_k||^2 ≤ δ^2 ||x||^2. (11.4.10)

Consider the three series
Σ [Tx_k - λ_k x_k], Σ Tx_k, Σ λ_k x_k. (11.4.11)

The first and the second series converge in norm to elements of H, the sum in the
second case being Tx. It follows that the third series also converges in norm so that
the convergence of

Σ λ_k^2 ||x_k||^2 = Σ λ_k^2 (Δ_k x, x) (11.4.12)
is necessary and sufficient for x ∈ D(T). By (11.4.10) for any y ∈ H
|(Tx, y) - Σ λ_k (Δ_k x, y)| ≤ δ ||x|| ||y||. (11.4.13)
Now to any positive δ we have Riemann-Stieltjes sums
S_1(δ) = Σ λ_k^2 (E(t_k)x - E(s_k)x, x),
S_2(δ) = Σ λ_k (E(t_k)x - E(s_k)x, y),

and by the preceding estimates (11.4.10) and (11.4.13) any sequence of such sums
with δ ↓ 0 converges to

∫_{-∞}^{∞} s^2 d(E(s)x, x) and ∫_{-∞}^{∞} s d(E(s)x, y),
respectively. Thus we have proved 


Theorem 11.4.1. A necessary and sufficient condition for x to belong to D(T)
is the convergence of

∫_{-∞}^{∞} s^2 d(E(s)x, x), (11.4.14)
and for any such x and any y ∈ H

(Tx, y) = ∫_{-∞}^{∞} s d(E(s)x, y). (11.4.15)


These are ordinary Riemann-Stieltjes integrals. With H-valued integrals we
have

Tx = ∫_{-∞}^{∞} s dE(s)x (11.4.16)

for any x satisfying (11.4.14). If T is a bounded operator, then this condition is
trivially satisfied for all x ∈ H.
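
The estimate (11.4.10) can be watched at work in the simplest possible setting. The sketch below is an illustration under stated assumptions, not the construction of the text: T is taken to be a diagonal operator on a finite-dimensional space with real eigenvalues eigs, so that E(t) keeps exactly the coordinates whose eigenvalue is less than t; the partition is s_k = kδ, t_k = (k + 1)δ, and the tag λ_k is chosen as s_k. The helper names stieltjes_sum and norm are hypothetical.

```python
import math

def stieltjes_sum(eigs, x, delta):
    """Return the vector  sum_k lam_k (E(t_k) - E(s_k)) x  with lam_k = s_k = k*delta.

    For a diagonal T the projection E(t_k) - E(s_k) keeps the coordinates
    whose eigenvalue lies in [s_k, t_k), so each coordinate of x is simply
    multiplied by the left endpoint of the interval containing its eigenvalue.
    """
    out = []
    for lam, xi in zip(eigs, x):
        k = math.floor(lam / delta)          # s_k <= lam < t_k
        out.append(k * delta * xi)
    return out

def norm(v):
    """Euclidean norm of a list of real numbers."""
    return math.sqrt(sum(c * c for c in v))

eigs = [-1.373, 0.234, 0.751, 2.946]          # assumed spectrum of T
x = [1.0, -2.0, 0.5, 3.0]
Tx = [lam * xi for lam, xi in zip(eigs, x)]

for delta in (0.5, 0.1, 0.01):
    approx = stieltjes_sum(eigs, x, delta)
    err = norm([p - q for p, q in zip(Tx, approx)])
    print(delta, err, delta * norm(x))        # error stays below delta * ||x||
```

In each case the error of the Riemann-Stieltjes sum is bounded by δ||x||, in agreement with (11.4.10), and it tends to zero with δ as (11.4.16) requires.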

Conversely, suppose that an operator function E(s) ∈ 𝔈(H) is defined for all
values of s, -∞ < s < +∞, and that E(s) has the properties of a spectral function
as stated in Theorem 11.3.8. Then a linear operator T is defined by (11.4.16) for
all x satisfying (11.4.14). It may be shown that T is Hermitian and that E(s) is its
spectral function.
We can generalize Theorem 11.4.1 in various ways. A simple extension is the
following. Suppose that F(s) ∈ C[-∞, ∞], then
∫_{-∞}^{∞} F(s) d(E(s)x, y) (11.4.17)
is well defined for all x, y ∈ H. We set
F(T)x = ∫_{-∞}^{∞} F(s) d(E(s)x). (11.4.18)

This defines a linear bounded operator F(T) ∈ 𝔈(H) which is Hermitian. This
definition represents an extension of the Operational Calculus of Section 11.3 to
unbounded operators. It will be proved below that for a bounded operator the two
definitions coincide.


Theorem 11.4.2. If F ∈ C[-∞, ∞] and x ∈ D(T), then

F(T)Tx = TF(T)x = ∫_{-∞}^{∞} s F(s) d(E(s)x). (11.4.19)


Proof. Since x ∈ D(T) and F is bounded by its C-norm ||F||_C we have

∫_{-∞}^{∞} s^2 |F(s)|^2 d(E(s)x, x) ≤ ||F||_C^2 ∫_{-∞}^{∞} s^2 d(E(s)x, x),

which exists by (11.4.14). It follows (by the Bounyakovsky-Schwarz inequality)
that the integral in (11.4.19) exists. On the other hand, F(T) is bounded and
x ∈ D(T), so that F(T)Tx exists and is the limit of Riemann-Stieltjes sums (with
notation as above),

Σ F(λ_k) [E(t_k)Tx - E(s_k)Tx] = T{Σ F(λ_k) [E(t_k)x - E(s_k)x]}.

For a sequence of increasingly refined partitions, the sum on the left converges to
F(T)Tx while the sum in braces on the right converges to F(T)x. Since T is closed,
this gives

[F(T)T]x = T[F(T)x]

for x ∈ D(T). Thus F(T)x ∈ D(T) and the operators T and F(T) commute on
D(T).
Next, note that
Σ F(λ_k) [E(t_k) - E(s_k)] [Tx - λ_k x] = Σ F(λ_k) [Tx_k - λ_k x_k].
The square of the norm of the last expression does not exceed
||F||_C^2 Σ ||Tx_k - λ_k x_k||^2 ≤ δ^2 ||F||_C^2 ||x||^2

by (11.4.10). It follows that

∫_{-∞}^{∞} s F(s) d(E(s)x, y) = ∫_{-∞}^{∞} F(s) d(E(s)Tx, y), ∀y.
Here the last expression is (F(T)Tx, y) and (11.4.19) holds. ∎
This implies 


Theorem 11.4.3. If T is a bounded Hermitian operator, then the operator F(T) 
defined in Theorem 11.3.2 for continuous functions F(t) coincides with that 
defined by (11.4.18). 


Proof. We start by proving the identity for powers of T. If the spectrum of T is
confined to the finite interval [a, b], we have by (11.4.16)
Tx = ∫_a^b s d(E(s)x).
More generally, by (11.4.18),
T^n x = ∫_a^b s^n d(E(s)x). (11.4.20)

To justify this we note that for

F(s) = a^n for s < a, F(s) = s^n for a ≤ s ≤ b, F(s) = b^n for b < s,

formula (11.4.18) makes sense and gives precisely (11.4.20). Hence for any
polynomial P(s) we have

P(T)x = ∫_a^b P(s) d(E(s)x). (11.4.21)

Now if F(s) ∈ C[a, b] we extend the definition by setting F = F(a) for s < a and
F = F(b) for b < s. The so extended function is an element of C[-∞, ∞] and
formula (11.4.18) applies and gives

F(T)x = ∫_a^b F(s) d(E(s)x).

On the other hand, we can find a sequence of polynomials {P_n(s)} converging
uniformly to F(s) in [a, b]. From

P_n(T)x = ∫_a^b P_n(s) d(E(s)x)
we get

lim_{n→∞} P_n(T)x = lim_{n→∞} ∫_a^b P_n(s) d(E(s)x)
= ∫_a^b lim_{n→∞} P_n(s) d(E(s)x) = ∫_a^b F(s) d(E(s)x).

The first member is F(T)x as defined by Theorem 11.3.2, while the last member is
F(T)x according to (11.4.18). Thus the two definitions coincide whenever they are
both meaningful. ∎


We call attention to two special cases of formula (11.4.18). 


Example 1. Let λ be a complex number, λ = μ + iν, ν ≠ 0, and set

F(s) = 1/(λ - s),
so that
F(T)x = ∫_{-∞}^{∞} (λ - s)^{-1} d(E(s)x).
If now x ∈ D(T),
TF(T)x = F(T)Tx = ∫_{-∞}^{∞} s (λ - s)^{-1} d(E(s)x)
and
(λI - T)F(T)x = ∫_{-∞}^{∞} (λ - s)(λ - s)^{-1} d(E(s)x) = x,
and this relation holds for all x. On the other hand,
F(T)(λI - T)x = x

holds only for x ∈ D(T). Thus we see that the resolvent of T exists for all non-real
values of λ and

R(λ, T)x = ∫_{-∞}^{∞} (λ - s)^{-1} d(E(s)x). (11.4.22)

For the existence of R(λ_0, T) for a real λ_0 it is necessary and sufficient that there is
an interval (λ_0 - δ, λ_0 + δ) in which the spectral function E(s) keeps a constant
value.


Example 2. Take

F(s) = (i + s)/(i - s)
and define
F(T)x = C(T)x = ∫_{-∞}^{∞} (i + s)(i - s)^{-1} d(E(s)x), ∀x. (11.4.23)
For x ∈ D(T)

(iI - T)C(T)x = ix + Tx,

so that C(T) is the Cayley operator (see Problem 16, Exercise 11.1).
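
A small numerical remark may make Example 2 more tangible. The sketch below, which is an illustration and not part of the text, evaluates the function F(s) = (i + s)/(i - s) at a few real points. For a diagonal operator T with real characteristic values s_j (an assumption made only for this sketch) the operator C(T) multiplies the jth coordinate by F(s_j), so that the two checks made here, |F(s)| = 1 and (i - s)F(s) = i + s, are the scalar shadows of the unitarity of C(T) and of the relation (iI - T)C(T)x = ix + Tx. The name cayley_symbol is hypothetical.

```python
def cayley_symbol(s):
    """F(s) = (i + s) / (i - s) for real s, as in Example 2."""
    return (1j + s) / (1j - s)

for s in (-1000.0, -2.0, 0.0, 0.5, 7.0):
    w = cayley_symbol(s)
    # |F(s)| = 1 and (i - s) F(s) = i + s for every real s
    print(s, abs(w), (1j - s) * w - (1j + s))
```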
We can generalize still further. Formula (11.4.18) stays meaningful as long as
the integral exists. This will be the case if F(s) ∈ C(-∞, ∞) and x is so chosen that

∫_{-∞}^{∞} |F(s)|^2 d(E(s)x, x) exists. (11.4.24)


EXERCISE 11.4 


1. What is the norm of the operator F(T) defined by (11.4.18)?
2. Use (11.4.23) to show that C(T) is unitary.
3. Take H = L_2(-π, π), set f(t) ~ Σ f_n e^{int}, and let T = -i(d/dt). Determine D(T).
Show that σ(T) is the set of all integers and that the spectral function E(s) is the
operation of taking partial sums of an unconventional kind so that

(E(s)f, f) = Σ |f_n|^2 for n < s.
Find (Tf, g) if f ∈ D(T), g ∈ H.


4. With the same choice of H, take Tf = -f'' and carry through a similar discussion.


5. If T is Hermitian with spectral function E(s), show that

exp (-itT)x = ∫_{-∞}^{∞} e^{-its} d(E(s)x)

is meaningful for all x if -∞ < t < ∞. Find its inverse and show that the operator
is unitary.


6. Show that for λ = μ + iν, ν > 0, for all x,

-i ∫_0^∞ e^{iλt} exp (-itT)x dt

makes sense and equals R(λ, T)x. What is the corresponding formula for ν < 0?


7. If E(s) is constant in some interval [α, β), show that for α < λ < β formula (11.4.22)
makes sense and represents R(λ, T)x. In particular, if T is bounded with spectrum
confined to [a, b], show that the formula represents R(λ, T)x for all λ not in [a, b].


8. Take H = L_2(-∞, ∞) and define the Dirichlet operator by

D_s[f](t) = (1/π) ∫_{-∞}^{∞} (sin su / u) f(u + t) du, s > 0,

and let D_s be the zero operator for s ≤ 0. Show that D_s is a spectral function.
It corresponds to the operator T which takes f into the conjugate of its derivative
where "conjugate" is taken in the sense of potential theory, so that the conjugate of
g is

g̃(t) = -(1/π) lim_{δ↓0} ∫_{|u|>δ} g(u + t) du/u.


9. Let {φ_n(t)} be a complete orthonormal system for L_2(-∞, ∞) and define

A[f](t) ~ Σ (-i)^n f_n φ_n(t) if f(t) ~ Σ f_n φ_n(t).

Show that this is a bounded normal operator. Find its spectrum and the canonical
decomposition A = B + iC where B and C are Hermitian. Find the spectral
functions corresponding to B and to C. If the set {φ_n(t)} is the system of normalized
weighted Hermitian polynomials, then A[f] is the Fourier transform of f.
10. Let {c_n; n = 1, 2, 3, ...} be a strictly decreasing sequence of positive numbers such
that Σ_{n=1}^{∞} n c_n^2 < ∞. Define a transformation T on l_2 by

Tx = {y_1, y_2, ..., y_n, ...} where y_n = Σ_{k=n}^{∞} c_k x_k.


Show that 7 is linear and bounded and that its spectrum is the set {c,}. Is T Hermi- 
tian or normal? If 7 is taken as a mapping from /, to /,, and if also }’,°c, < οὐ, 
show that the point spectrum covers the whole plane less the origin. 


11. Show that a Hilbert space is separable iff it contains a complete orthonormal 
system. 


12. For a Hilbert space and a Hermitian operator A show that the Landau-Kallman-
Rota inequality (5.3.22) may be sharpened to ||A(x)||^2 ≤ ||x|| ||A^2(x)||.
12 FUNCTIONAL INEQUALITIES I.
FUNCTIONS OF A SINGLE VARIABLE


In several branches of analysis functional inequalities enter in a decisive manner. 
They underlie uniqueness theorems and they characterize important classes of
functions. Moreover, it is often possible to describe functions satisfying a given 
functional inequality in terms of the solution of an associated fixed-point theorem. 
In this and the following chapter we shall discuss two different types of 
functional inequality involving functions of a single variable and functions of two 
variables, respectively. For the first type we consider the canonical form 


f(t) ≤ T[f](t),


where T is a mapping from a given metric space of functions f into itself. 
Here partial ordering is often a useful tool. 

There are five sections: Classification; Some determinative inequalities; Use 
of fixed-point theorems; Some applications; and Remarks on a class of 
multiplicative inequalities. 


12.1 CLASSIFICATION 


Let X be a complete metric space the elements of which are mappings from some
interval [a, b] into E^1. Further properties of X and of its elements f will be
specified as need arises. Let T be a mapping of X into itself and consider the
inequality

f(t) ≤ T[f](t), ∀t ∈ [a, b], (12.1.1)
or
f ≤ T[f], (12.1.2)
if partial ordering is introduced in X in the natural manner, so that
f_1 ≤ f_2 iff f_1(t) ≤ f_2(t), ∀t. (12.1.3)

There exists a subset X_0 of X the elements of which satisfy (12.1.2). The
nature of the subset will lead to a useful classification of the inequality.

(1) X_0 = X. In this case the inequality is trivial for X. As an example, take
X = C[a, b] and
T[f] = (1/4)[3 + f^4],
so that the inequality reads

f(t) ≤ (1/4){3 + [f(t)]^4}. (12.1.4)

Now the quartic

y = x^4 - 4x + 3
has a unique minimum for x = 1 with y = 0 and y' < 0 for x < 1 and
y' > 0 for x > 1. It follows that y is never negative and this implies that (12.1.4)
holds for every f ∈ C[a, b]. Thus the inequality is trivial, which does not
impair its usefulness.
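
The non-negativity of the quartic can also be read off from a factorization, which may be preferred to the argument by derivatives. The following worked identity is an addition to the text and is easily verified by multiplying out.

```latex
\begin{align*}
x^{4} - 4x + 3 &= (x-1)^{2}\,(x^{2} + 2x + 3) \\
               &= (x-1)^{2}\,\bigl[(x+1)^{2} + 2\bigr] \;\ge\; 0 ,
\end{align*}
so that $x \le \tfrac14\,(3 + x^{4})$ for every real $x$,
with equality only for $x = 1$.
```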


(2) X_0 is a non-empty proper subset of X which contains more than one
element. Then (12.1.2) is said to be restrictive.

An example is given by X = C[a, b] restricted to the real-valued elements
and T[f] = f^2, the inequality being

f(t) ≤ [f(t)]^2. (12.1.5)


There are obviously elements of X which satisfy this inequality. A necessary and
sufficient condition is that the range of f omits the interval (0, 1). On the
other hand, there are just as obviously elements of X which do not satisfy
(12.1.5). Hence X_0 is a proper non-void subset of X containing more than one
element (actually infinitely many). Thus (12.1.5) is restrictive in our classification.


(3) If X_0 reduces to a single element, the inequality is said to be determinative.
The next section is devoted to determinative inequalities. Here we give a fairly
trivial example. Take X = C^+[0, 1] and the inequality

f(t) ≤ t[f(t)]^2, 0 ≤ t ≤ 1. (12.1.6)

This has a unique solution in C^+[0, 1], namely f(t) ≡ 0. Hence the inequality
is determinative.


(4) If X_0 is void, we say that the inequality is absurd (for the given space X;
enlarging the space may change the classification).
Among the possible inequalities of this nature we select

f(t) ≤ -1 - [f(t)]^2, (12.1.7)

X being the subset of real-valued elements of C[a, b]. There is no element of X
which satisfies the inequality, so (12.1.7) is absurd in this setting.


EXERCISE 12.1 


If X = C^+[0, 1] or a subset thereof as specified, classify the following inequalities. If a
subset is specified, verify that it by itself is a complete metric space in the induced topology.


1. f(t) < 1 -- [ΧΩ]: }, If] <1. 

2. ft) < 20fOV -- LFOV, Fil « 1. 
3. f(t) < sin [Lf], SI < π. 

4. f(t) < ἰδ [0], [|| < 42. 
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ΓΚ 12 
Ι- λ΄ 
6. Τὴ «τ + (FOP. 

7. ft) <[fM] -1 for ΤΠ εΟἾ[0,17. 
[fOr 
1+f(r) ° 
9. f(t) < fGo). 

10. f(t) <3 fGrt) + 4/0). 
11. f(t) < ΣΧ] with || f|| <1 where 


5. f(t) < 


8. f(t) < for T[f](@)¢C*[0, 1]. 


CO οο 
F(u) = dia,u", 0Κα, Vn, δ'α,Ξθ «Ἱ 
0 0 


and the radius of convergence of the power series equals 1. There are two cases 
according as dy = 0 or dy > 0. 


12.2 SOME DETERMINATIVE INEQUALITIES 


There are several inequalities of this type which play a decisive role for 
uniqueness theorems. In view of their importance we state them as theorems. 
We also list the resulting uniqueness theorems. 


Theorem 12.2.1. If f ∈ C^+[0, b] and if

f(t) ≤ K ∫_0^t f(s) ds, 0 ≤ t ≤ b, (12.2.1)
then f(t) ≡ 0.


Proof. We shall give two distinct proofs here, and other proofs will be found 
in Exercise 12.2 or as corollaries of later theorems. 


1. Proof by iteration. By substitution we get

0 ≤ f(t) ≤ K^2 ∫_0^t ∫_0^s f(u) du ds = K^2 ∫_0^t (t - s) f(s) ds

by a classical reduction formula for multiple integrals. Repeating the procedure,
we get

0 ≤ f(t) ≤ (1/(n - 1)!) K^n ∫_0^t (t - s)^{n-1} f(s) ds. (12.2.2)

Hence the sup-norm satisfies

||f|| ≤ (1/n!) (Kb)^n ||f||, (12.2.3)
and for large values of n the multiplier on the right is < 1. This requires
||f|| = 0 and hence f = 0.
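
How large n must be depends only on the product Kb. The following sketch is added for illustration; it computes the smallest n for which the multiplier (Kb)^n/n! of (12.2.3) drops below 1. The function name first_contracting_power is hypothetical.

```python
def first_contracting_power(K, b):
    """Smallest n >= 1 with (K*b)**n / n! < 1, together with that multiplier."""
    n, mult = 0, 1.0
    while mult >= 1.0:
        n += 1
        mult *= K * b / n        # (Kb)^n / n! built up one factor at a time
    return n, mult

print(first_contracting_power(0.5, 1.0))
print(first_contracting_power(5.0, 1.0))
```

For Kb = 1/2 the first step already suffices, while for Kb = 5 the multiplier first falls below 1 at n = 12, where 5^12/12! is roughly 0.51.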


2. Proof by integration. Set

F(t) = ∫_0^t f(s) ds (12.2.4)
and note that
F(0) = 0, F(t) ≥ 0, F'(t) = f(t).
Hence
F'(t) - KF(t) ≤ 0, (12.2.5)

i.e. an integral inequality has been replaced by a differential inequality. Normally
the latter type is more difficult to discuss than the former, but in the present
case we reach the desired goal after multiplying both members by exp (-Kt)
to obtain an exact derivative

d/dt [F(t) exp (-Kt)] ≤ 0. (12.2.6)

This shows that F(t) exp (-Kt) is non-increasing in (0, b). Since F(0) = 0, this
implies F(t) ≤ 0. But we already know that F(t) ≥ 0, so we must have
F(t) ≡ 0 and hence also f(t) ≡ 0. ∎


This type of argument is quite useful in handling various types of 
functional inequality involving integration. 

Let us return for a moment to Theorem 6.4.1, where the uniqueness proof was
omitted.


Theorem 12.2.2. The solution of a differential equation of type (6.4.5) with a 
Lipschitz condition (6.4.4) is unique and the same is true for a matrix equation 
(6.4.7) with the Lipschitz condition (6.4.6). 


Proof. This is an immediate consequence of Theorem 12.2.1. To fix the ideas,
take the matrix case. If there are two solutions Y_1 and Y_2 of the equation

Y'(x) = F[x, Y(x)], Y(0) = Y_0, (12.2.7)
valid for |x| < r_0, then for 0 ≤ x ≤ r, where r < r_0,
||Y_2(x) - Y_1(x)|| ≤ ∫_0^x ||F[s, Y_2(s)] - F[s, Y_1(s)]|| ds.
By the Lipschitz condition
||Y_2(x) - Y_1(x)|| ≤ K ∫_0^x ||Y_2(s) - Y_1(s)|| ds.

This states that the continuous non-negative function ||Y_2(x) - Y_1(x)|| satisfies
(12.2.1) with b = r, and is consequently identically zero. The extension to
(-r_0, 0) is trivial. ∎


We can generalize Theorem 12.2.1, replacing the constant K by an integrable 
function K(s) under the sign of integration. 


Theorem 12.2.3. If f ∈ C^+[0, b], if K ∈ C^+(0, b] ∩ L(0, b), and if

f(t) ≤ ∫_0^t K(s) f(s) ds, ∀t, (12.2.8)
then f(t) ≡ 0.


A proof can be given by either of the methods used above in the case 
K(t) = K and will also follow as a corollary of later theorems. This theorem 
can be extracted from an existence theorem for differential equations due to the
brilliant and colorful Turkish-Greek-German mathematician Constantin
Carathéodory (1873-1950) in 1918. His theorem is obtained if in the formulation of
Theorem 6.4.1 the boundedness condition (6.4.9) be replaced by

||F(x, y)|| ≤ K(|x|) (12.2.9)
and the Lipschitz condition by

||F(x, y_2) - F(x, y_1)|| ≤ K(|x|) ||y_2 - y_1||. (12.2.10)


Such conditions permit the consideration of equations where the right member 
becomes infinite as x > 0, provided the infinitude is integrable. The uniqueness 
of the solution follows from the fact that if y,(x) and y,(x) are two solutions with 
the same initial value, then ||y,(x) — y,(x)|| satisfies (12.2.8). 

A uniqueness theorem found by Mitio Nagumo for differential equations in 
1926 is based on the following functional inequality, which is not a special case of 
Theorem 12.2.3. 


Theorem 12.2.4. Let X be the subspace of C^+[0, b], the elements of which
satisfy f(0) = 0, lim_{h↓0} f(h)/h = 0. Then, if f ∈ X and if

f(t) ≤ ∫_0^t f(s) ds/s, (12.2.11)
it follows that f(t) ≡ 0.


A proof may be based upon the integration method. The details are left to
the reader.
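
For orientation, one way of carrying out the integration method for Theorem 12.2.4 is sketched below. This is an outline supplied here under the hypotheses of the theorem, and it is not claimed to be the argument the author had in mind; the auxiliary function is again denoted by F.

```latex
Set $F(t) = \int_0^t f(s)\,\dfrac{ds}{s}$, which exists since $f(s)/s \to 0$ as $s \downarrow 0$.
Then $F(t) \ge 0$, $F'(t) = f(t)/t$ for $t > 0$, and (12.2.11) reads $f(t) \le F(t)$. Hence
\begin{align*}
\frac{d}{dt}\left[\frac{F(t)}{t}\right]
  = \frac{t\,F'(t) - F(t)}{t^{2}}
  = \frac{f(t) - F(t)}{t^{2}} \le 0 , \qquad 0 < t \le b ,
\end{align*}
so that $F(t)/t$ is non-increasing. Since $f(s)/s \to 0$ we have $F(t) = o(t)$, that is
$\lim_{t \downarrow 0} F(t)/t = 0$, and therefore $F(t)/t \le 0$ for $0 < t \le b$.
Together with $F(t) \ge 0$ this gives $F(t) \equiv 0$, and $0 \le f(t) \le F(t)$ yields $f(t) \equiv 0$.
```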
Nagumo's uniqueness theorem is relevant if in Theorem 6.4.1 the Lipschitz
condition is replaced by

||F(x, y_2) - F(x, y_1)|| ≤ (1/|x|) ||y_2 - y_1||. (12.2.12)

If there were two solutions, then the norm of their difference is a function f
satisfying the conditions of Theorem 12.2.4 and is hence identically zero. Thus
the solution is unique.
Other determinative inequalities will be encountered later.


EXERCISE 12.2 
1. Verify Theorem 12.2.1 by a step-by-step argument. Start by showing f(t) ≡ 0 if
bK < 1 and extend to longer intervals.
If X = C^+[0, b], show that the following inequalities are determinative:
2. f(t) ≤ log [1 + f(t)].
3. f(t) ≤ 1 - exp [-f(t)].


4. With the same choice of X, for what values of the positive number a is

f(t) ≤ ∫_0^t [f(s)]^a ds

determinative? Distinguish the two cases 0 < a < 1 and 1 ≤ a which require
different treatment.


5. Prove Theorem 12.2.3. 
6. Prove Theorem 12.2.4. 
7. Verify that (12.2.12) implies uniqueness. 


8. It was said at the end of the proof of Theorem 12.2.2 that the extension to the interval 
(— ro, 0) is trivial. If it is trivial, it can be done. Do it! 


12.3 USE OF FIXED-POINT THEOREMS 


The reader may have noticed that the problems in the preceding part of the
chapter frequently led to conditions related to the fixed point of the corresponding
transformation T. Thus it turned out that the determinative inequalities could
only be satisfied by the invariant element, which in Section 12.2 was f(t) ≡ 0.
But restrictive inequalities lead to similar considerations. Thus in Problem 4,
Exercise 12.2, take a = 1/2 so that the inequality reads

f(t) ≤ ∫_0^t [f(s)]^{1/2} ds. (12.3.1)

It is found that f(t) must satisfy 0 ≤ f(t) ≤ (1/4)t^2. In this case the transformation
T defined by the right member of (12.3.1) has infinitely many fixed points, the
extremal solutions being f(t) ≡ 0 and f(t) = (1/4)t^2. In the cases considered the
underlying spaces would admit partial ordering. This suggests the existence of
general relations between functional inequalities and fixed-point theorems in
partially ordered spaces when the transformation T is order preserving. A typical
result obtained from this starting point is


Theorem 12.3.1. Let X be a complete metric space which is partially ordered
in such a manner that if {x_n} is an increasing sequence in X so that
x_n ≤ x_{n+1} for all n, and if lim_{n→∞} x_n = x_0 exists in the sense of the metric, then
x_n ≤ x_0 for all n. Let T be an order-preserving mapping from X into X such that
T^m is a contraction for some m. Let f_0 be the unique fixed point of T.
Then

f ≤ T[f] implies f ≤ f_0. (12.3.2)


Proof. We say that T is order preserving if for f_1, f_2 ∈ X

f_1 ≤ f_2 implies T[f_1] ≤ T[f_2]. (12.3.3)

Suppose that f ∈ X_0, that is f ≤ T[f]. Note that X_0 is not void since it
contains f_0 at least. Then

f ≤ T[f] ≤ T^2[f] ≤ ... ≤ T^n[f] ≤ ... . (12.3.4)
Here
lim_{k→∞} T^{km}(T^s[f]) = f_0, s = 0, 1, ..., m - 1,

since T^m is a contraction and the limit is the same for all elements of X.
See Section 5.4. It follows that the whole sequence, which is increasing, must
converge to f_0. The theorem is proved since order is preserved under the limit
operation. ∎


Example 1. We take the inequality of Theorem 12.2.1 with

T[f](t) = K ∫_a^t f(s) ds (12.3.5)

and X = C^+[a, b]. Here (5.4.10) and (5.4.11) show that T^m is a strict contraction
for all large values of m. Further, T is a positive operator; it takes positive
elements of C[a, b] into positive elements, i.e. C^+[a, b] is mapped into itself.
This implies that T is order preserving. It is clear that the zero element is in X
and is left invariant by T, so it is a fixed point. Theorem 12.3.1 now gives
f ≤ 0, which together with f ≥ 0 gives f = 0 as the only element of X that
satisfies f ≤ T[f] with T given by (12.3.5).

The same type of argument applies to the Carathéodory inequality (12.2.8). 
In these cases linear spaces and positive operators are involved. Such cases may 
be handled by the Volterra fixed point theorem, which gives 
Theorem 12.3.2. Let X be a partially ordered B-space such that the positive
cone X^+ is a closed point set. Let S be a positive linear bounded operator
from X to X such that

Σ_{n=1}^{∞} ||S^n|| < ∞. (12.3.6)

Let g be a given element of X^+, f_0 the unique fixed point of
T[f] = g + S[f]. (12.3.7)
Then

f ≤ g + S[f] implies f ≤ f_0. (12.3.8)


Proof. The iterates

T^n[f] = g + S[g] + ... + S^{n-1}[g] + S^n[f] (12.3.9)

form an increasing sequence since f ≤ T[f], while g is positive and S is a
positive linear operator and hence order preserving. By Theorem 5.6.1

lim_{n→∞} T^n[f] = f_0.

Combining this with f ≤ T[f] and the order-preserving properties of limits,
we get (12.3.8). ∎


An important special case is the following. 


Theorem 12.3.3. Let K ∈ C^+(a, b) ∩ L(a, b). Let f and g belong to C^+[a, b]
and suppose that

f(t) ≤ g(t) + ∫_a^t K(s) f(s) ds, ∀t ∈ [a, b]. (12.3.10)
Then
f(t) ≤ g(t) + ∫_a^t K(s) exp [∫_s^t K(u) du] g(s) ds. (12.3.11)
Proof. Here
S[f](t) = ∫_a^t K(s) f(s) ds, (12.3.12)

which clearly exists as an element of C^+[a, b]. Further,

S^2[f](t) = ∫_a^t K(s) ∫_a^s K(u) f(u) du ds
= ∫_a^t K(u) f(u) [∫_u^t K(s) ds] du.
It follows that

||S^2[f]|| ≤ ||f|| ∫_a^b K(u) [∫_u^b K(s) ds] du
= (1/2) ||f|| [∫_a^b K(s) ds]^2.

Hence
||S^2|| ≤ (1/2!) [∫_a^b K(s) ds]^2
and by induction
||S^n|| ≤ (1/n!) [∫_a^b K(s) ds]^n. (12.3.13)
Hence (12.3.6) holds. Thus
S^n[g](t) = (1/(n - 1)!) ∫_a^t g(s) K(s) [∫_s^t K(u) du]^{n-1} ds.

Summing for n, we obtain the right member of (12.3.11). ∎
Corollary. Theorem 12.2.3.


For if g(s) ≡ 0, so is the right member of (12.3.11) and f(t) ≡ 0 results.
Let us specialize again. If K(t) ≡ K, then (12.3.10) becomes

f(t) ≤ g(t) + K ∫_a^t f(s) ds (12.3.14)
and (12.3.11) reduces to
f(t) ≤ g(t) + K ∫_a^t exp [K(t - s)] g(s) ds. (12.3.15)
If g(t) is also constant, then
f(t) ≤ g + K ∫_a^t f(s) ds (12.3.16)
implies
f(t) ≤ g exp [K(t - a)]. (12.3.17)

The set of the three inequalities (12.3.11), (12.3.15), and (12.3.17) is known 
as Gronwall’s Lemma after the Swedish-American mathematician Thomas Hakon 
Gronwall (1877-1932), who found a special case in 1918. We shall consider some 
implications of this lemma in the next section. 
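
The constant case (12.3.16), (12.3.17) lends itself to a numerical experiment. The sketch below is an illustration only: it iterates the transformation T[f](t) = g + K ∫_a^t f(s) ds of (12.3.7) on a uniform grid, starting from f = 0 and using the trapezoidal rule for the integral (both choices are assumptions of the sketch), and compares the result with the bound g exp [K(t - a)], which is the fixed point f_0 of T. The name picard_iterates and the parameters steps and n_iter are hypothetical.

```python
import math

def picard_iterates(g, K, a, b, steps=2000, n_iter=40):
    """Iterate T[f](t) = g + K * integral_a^t f(s) ds on a uniform grid, from f = 0.

    Returns the grid values of the n_iter-th iterate at t_j = a + j*(b - a)/steps.
    """
    h = (b - a) / steps
    f = [0.0] * (steps + 1)
    for _ in range(n_iter):
        new, acc = [g], 0.0
        for j in range(1, steps + 1):
            acc += 0.5 * h * (f[j - 1] + f[j])   # trapezoidal rule on [t_{j-1}, t_j]
            new.append(g + K * acc)
        f = new
    return f

g, K, a, b = 2.0, 1.5, 0.0, 1.0
f0 = picard_iterates(g, K, a, b)
print(f0[-1], g * math.exp(K * (b - a)))         # the two values nearly agree
```

Any f satisfying (12.3.16) lies below this limit. For instance f(t) = g exp [K(t - a)/2] satisfies (12.3.16), since the right member then equals 2f(t) - g, and it is indeed dominated by g exp [K(t - a)].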

In this discussion and also in the preceding section we have imposed
unnecessary restriction on the space X. Thus in Theorems 12.2.1, 12.2.3, and
12.3.3 we could replace C^+[a, b] by L^+(a, b) with the inequalities holding for
almost all t. We can also generalize in a different direction.

Consider the space M_n C^+[a, b] briefly mentioned towards the end of
Section 3.2. This is the set of all n by n matrices F(t) = (f_{jk}(t)), where
f_{jk}(t) ∈ C^+[a, b]. Let K = (k_{jk}), k_{jk} ≥ 0, be a positive constant matrix. Then
Theorem 12.2.1 generalizes to


Theorem 12.3.4. The zero matrix is the only element of $\mathfrak{M}_n C^{+}[a,b]$ which
satisfies

\[ F(t) \le K \int_a^t F(s)\,ds, \quad \forall t \in [a,b]. \tag{12.3.18} \]

Proof. We recall that partial ordering is available in this space and both sides
of (12.3.18) are positive matrices. As the norm in $X$ we use

\[ \|F\| = \sup_{t} \sup_{j} \sum_{k=1}^{n} |f_{jk}(t)|. \tag{12.3.19} \]

If $T[F]$ denotes the right member of (12.3.18), then

\[ \|T\| \le (b - a)\,\|K\| \]

and, by induction,

\[ \|T^{n}\| \le \frac{1}{n!} (b - a)^{n} \|K\|^{n}, \tag{12.3.20} \]

so that $T^{m}$ is a contraction from $X$ into $X$ for all large $m$. Thus $T$ has a
single fixed point. Now the zero matrix is obviously left invariant by $T$, so
that it is the fixed point. Theorem 12.3.1 now gives $F \le 0$, which, combined with
$F \ge 0$, gives $F = 0$. ∎


Similar generalizations may be found for the other theorems listed above. 
We end this discussion by noting a result for the transformation $T^{2}$ where
$T$ is the operator of (12.2.1).

Theorem 12.3.5. If $f$ and $g \in C^{+}[a,b]$ and if

\[ f(t) \le g(t) + K^{2} \int_a^t (t - s) f(s)\,ds, \tag{12.3.21} \]

then

\[ f(t) \le g(t) + K \int_a^t \sinh[K(t - s)]\, g(s)\,ds. \tag{12.3.22} \]


The proof is left to the reader. 
Let us return briefly to the point of departure, namely Gronwall's Lemma
or Theorem 12.3.3. There we are concerned with the positive cone of the
Banach algebra $C[a,b]$ and the operator

\[ T[f](t) = \int_a^t K(s) f(s)\,ds, \quad a \le t \le b. \tag{12.3.23} \]

$T$ is a linear positive quasi-nilpotent element of the operator algebra $\mathfrak{E}\{C[a,b]\}$.
J. B. Miller has called my attention to the fact that $T$ is an antiderivation. This
is a class of linear bounded operators from a B-algebra $\mathfrak{B}$ into itself to which he
has given much attention. The basic identity for an antiderivation $G$ is

\[ (Gx) \cdot (Gy) = G[(Gx) \cdot y + x \cdot (Gy)], \tag{12.3.24} \]

where $x, y \in \mathfrak{B}$. Miller has determined the resolvent of $G$ and shown that the
spectrum of $G$ reduces to $\{0\}$. It follows that

\[ \lim_{n \to \infty} \|G^{n}\|^{1/n} = 0. \tag{12.3.25} \]

Suppose now that $\mathfrak{B}$ is partially ordered and $G$ is a positive operator from
$\mathfrak{B}$ to $\mathfrak{B}$, which is also an antiderivation. Consider the functional inequality

\[ f \le g + Gf, \tag{12.3.26} \]

where $f, g \in \mathfrak{B}$ and $g$ is fixed. By Theorem 12.3.2

\[ f \le g + \sum_{n=1}^{\infty} G^{n}[g], \tag{12.3.27} \]

where the series is absolutely convergent by (12.3.25). This is the analogue of
Theorem 12.3.3 for positive antiderivations.
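
The antiderivation identity $(Gx)\cdot(Gy) = G[(Gx)\cdot y + x\cdot(Gy)]$ can be checked exactly in the simplest instance. The sketch below assumes $K(s) \equiv 1$ and $a = 0$, so that $G$ is integration from $0$ to $t$, and works with polynomials stored as coefficient lists. The helper names and the two sample polynomials are illustrative choices only.

```python
from fractions import Fraction

def integrate(p):
    """G[p](t) = int_0^t p(s) ds for a polynomial given as coefficients, constant term first."""
    return [Fraction(0)] + [Fraction(c) / (k + 1) for k, c in enumerate(p)]

def multiply(p, q):
    """Pointwise product of two polynomials."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def add(p, q):
    """Sum of two polynomials, padding the shorter one with zeros."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [Fraction(a) + Fraction(b) for a, b in zip(p, q)]

x = [1, 2, 0, 3]   # 1 + 2t + 3t^3
y = [0, -1, 4]     # -t + 4t^2

lhs = multiply(integrate(x), integrate(y))
rhs = integrate(add(multiply(integrate(x), y), multiply(x, integrate(y))))
print(lhs == rhs)
```

In this special case the identity is just integration by parts, which is why exact rational arithmetic reproduces it coefficient by coefficient.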


EXERCISE 12.3 


1. Verify (12.3.13). 

2. Show that Theorem 12.2.1 holds for $L^{+}(0, b)$. Estimate $\|T^{n}\|$ in this space.
3. Verify (12.3.20).
4. Extend (12.3.16) to the matrix case, i.e. show that if $F \in \mathfrak{M}_n C^{+}[a,b]$ and $G$ and $K$
are fixed positive matrices, then

\[ F(t) \le G + K \int_a^t F(s)\,ds \]

implies that

\[ F(t) \le \exp[K(t - a)]\, G. \]

Note the order of the factors: $G$ and $K$ need not commute.


5. Prove Theorem 12.3.5. 




6. It is desired to give an extension of Nagumo's inequality analogous to Theorem 12.3.3.
Let $X$ be the space of functions $f$ defined on $[0, b]$ with the following properties:
(1) $f \in C^{+}[0, b]$, (2) $\lim_{t \downarrow 0} f(t) = 0$, (3) $\lim_{t \downarrow 0} t^{-1} f(t) = 0$,
(4) $t^{-2} f(t) \in L(0, b)$. Show that $X$ is a complete metric space under a suitable norm.
Use the method of integration [method (2) in the proof of Theorem 12.2.1] or other device
to prove that if $g$ and $f \in X$, $g$ fixed, and if

\[ f(t) \le g(t) + \int_0^t f(s)\, s^{-1}\,ds, \]

then

\[ f(t) \le g(t) + t \int_0^t g(s)\, s^{-2}\,ds. \]

7. Extend the preceding result to the matrix case, i.e. let $X$ be the subspace of
$\mathfrak{M}_n C^{+}[a, b]$ where the elements satisfy conditions (2) to (4) with $f$ replaced by a
matrix $F$.

8. Let fsatisfy (12.3.1) and f(t) = 0,0 <t<a<b. Showthat0 < f(t) « 26 -- a)’, 
a <t <b. Show that the majorant is also a fixed point of T acting on C*[0, δ]. 


12.4 SOME APPLICATIONS 


Gronwall developed his lemma with a view to applications in the theory of 
differential equations. To illustrate the power of the lemma we shall discuss two 
such applications. 

The first concerns the change in the solution caused by a small change in 
the initial condition. To fix the ideas, consider the matrix case (see Theorem 
6.4.2) 

Y’(x) = F[x, Y(x)], (12.4.1) 


where $F$ is bounded and satisfies a Lipschitz condition for $|x| \le a$, $\|Y\| \le b$.
We denote by $Y_1$ and $Y_2$ the two solutions defined by

\[ Y_1(0) = 0, \quad Y_2(0) = Y_0, \tag{12.4.2} \]

respectively, where for convenience we assume $\|Y_0\| < \tfrac{1}{2} b$.

Theorem 12.4.1. Under the stated conditions there exists an $r_1 > 0$ such that
for $|x| < r_1$

\[ \|Y_2(x) - Y_1(x)\| \le \|Y_0\| \exp[K|x|]. \tag{12.4.3} \]

Proof. Suppose that the solutions exist for $|x| \le r_1 \le a$. Then

\[ Y_1(x) = \int_0^x F[s, Y_1(s)]\,ds, \quad Y_2(x) = Y_0 + \int_0^x F[s, Y_2(s)]\,ds, \]

and by familiar estimates for $0 \le |x| \le r_1$

\[ \|Y_2(x) - Y_1(x)\| \le \|Y_0\| + K \Big| \int_0^x \|Y_2(s) - Y_1(s)\|\,ds \Big|. \]

This inequality is of type (12.3.10), and (12.4.3) then follows from (12.3.17). ∎
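
The estimate can be tried out on a scalar example. The sketch below assumes the hypothetical right-hand side $F(x, y) = \sin y + x$, which has Lipschitz constant $K = 1$ in $y$, integrates the two initial value problems with a classical Runge-Kutta step, and records the largest ratio of $|y_2(x) - y_1(x)|$ to $|y_0| \exp[K|x|]$ on the grid. The step count and the value $y_0 = 0.25$ are arbitrary choices.

```python
import math

def rk4(F, y0, x_end, steps):
    """Classical Runge-Kutta for y' = F(x, y), y(0) = y0; returns the (x, y) grid values."""
    h = x_end / steps
    x, y = 0.0, y0
    out = [(x, y)]
    for _ in range(steps):
        k1 = F(x, y)
        k2 = F(x + h / 2, y + h * k1 / 2)
        k3 = F(x + h / 2, y + h * k2 / 2)
        k4 = F(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
        out.append((x, y))
    return out

F = lambda x, y: math.sin(y) + x   # |dF/dy| = |cos y| <= 1, so K = 1 (assumed example)
K, y0 = 1.0, 0.25
sol1 = rk4(F, 0.0, 1.0, 200)       # solution with initial value 0
sol2 = rk4(F, y0, 1.0, 200)        # solution with initial value y0

worst_ratio = max(
    abs(q[1] - p[1]) / (abs(y0) * math.exp(K * p[0]))
    for p, q in zip(sol1, sol2)
)
print(worst_ratio)
```

A ratio that never exceeds 1 (up to rounding) is what the exponential estimate predicts; the bound is attained at $x = 0$ and is not sharp for larger $x$ in this example.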


As the second application we take the problem of giving a priori estimates 
of the rate of growth of solutions for approach to a singular point of the 
equation. This method applies to linear equations and we shall consider first 
order matrix equations on the interval (0, δ]. 

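Theorem 12.4.2. Let

\[ Y'(t) = A(t)\, Y(t), \quad Y(b) = \mathcal{E}, \tag{12.4.4} \]

where $A(t) \in \mathfrak{M}_n C(0, b]$ and becomes infinite as $t \downarrow 0$ in such a manner that there
is a fixed $c > 0$ for which

\[ 0 < \sup\, [t^{c} \|A(t)\|] = m < \infty. \tag{12.4.5} \]

Then the behavior of $\|Y(t)\|$ as $t \downarrow 0$ is governed by the value of $c$ as
follows. If

(1) $0 < c < 1$, then $\|Y(t)\|$ is bounded away from $0$ and $\infty$;

(2) $c = 1$, then

\[ (t/b)^{m} \le \|Y(t)\| \le (b/t)^{m}; \tag{12.4.6} \]

(3) $c > 1$, then there exist positive numbers $A$ and $B$ such that

\[ A \exp\Big\{ -\frac{m}{c-1}\, t^{1-c} \Big\} \le \|Y(t)\| \le B \exp\Big\{ \frac{m}{c-1}\, t^{1-c} \Big\}. \tag{12.4.7} \]

Proof. For $0 < \alpha < \beta \le b$

\[ Y(\beta) - Y(\alpha) = \int_{\alpha}^{\beta} A(s)\, Y(s)\,ds. \tag{12.4.8} \]

This gives

\[ \|Y(\beta)\| \le \|Y(\alpha)\| + \int_{\alpha}^{\beta} \|A(s)\|\, \|Y(s)\|\,ds
   \le \|Y(\alpha)\| + m \int_{\alpha}^{\beta} s^{-c}\, \|Y(s)\|\,ds. \tag{12.4.9} \]

Gronwall's Lemma applies to this inequality and gives

\[ \|Y(\beta)\| \le \|Y(\alpha)\| \exp\Big[ m \int_{\alpha}^{\beta} s^{-c}\,ds \Big]. \tag{12.4.10} \]

Here is where the three cases separate. The value of the exponential factor
equals

\[ \text{Case (1).} \quad \exp\Big\{ \frac{m}{1-c}\, [\beta^{1-c} - \alpha^{1-c}] \Big\}; \tag{12.4.11} \]

\[ \text{Case (2).} \quad (\beta/\alpha)^{m}; \tag{12.4.12} \]

\[ \text{Case (3).} \quad \exp\Big\{ \frac{m}{c-1}\, [\alpha^{1-c} - \beta^{1-c}] \Big\}. \tag{12.4.13} \]

Here we set $\alpha = t$, $\beta = b$ and obtain the lower bounds for $\|Y(t)\|$ stated in
the theorem.
To obtain the upper bounds we need the additional inequality

\[ \|Y(\alpha)\| \le \|Y(\beta)\| \exp\Big[ m \int_{\alpha}^{\beta} s^{-c}\,ds \Big]. \tag{12.4.14} \]

There are various ways of proving this. The simplest is perhaps to make a
change of variables. We set

\[ t = b - u, \quad Y(t) = Z(u), \tag{12.4.15} \]

so that

\[ Z'(u) = -A(b - u)\, Z(u), \quad Z(0) = \mathcal{E}. \tag{12.4.16} \]

Treating this equation as (12.4.4) was treated, we get

\[ \|Z(u)\| \le \|Z(v)\| \exp\Big[ m \int_{v}^{u} (b - \sigma)^{-c}\,d\sigma \Big] \]

or

\[ \|Y(\alpha)\| \le \|Y(\beta)\| \exp\Big[ m \int_{\alpha}^{\beta} s^{-c}\,ds \Big]. \tag{12.4.17} \]

Again separating cases, we obtain the upper bounds stated in the theorem. The
details are left to the reader. ∎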


In Case (1) a sharper statement may be proved, namely, that $Y(t)$ tends to
a limit $\ne 0$ as $t \downarrow 0$.
Cases (2) and (3) will arise if, for instance, $t = z$ is a complex variable and

\[ A(z) = z^{-p-1} \sum_{j=0}^{\infty} A_j z^{j}, \tag{12.4.18} \]

where $p$ is a non-negative integer, $A_0 \ne 0$, and the matrix power series has a
positive radius of convergence. Here $p = 0$ gives Case (2) and $z = 0$ is known as
a regular singular point of the equation, while $p > 0$ leads to Case (3) and an
irregular singular point.
EXERCISE 12.4


1. Make a comparison along the lines of Theorem 12.4.1 between the solutions of

\[ Y'(x) = F[x, Y(x)] \quad \text{and} \quad Z'(x) = G[x, Z(x)] \]

with $Y(0) = Z(0) = Y_0$, if $F$ and $G$ are bounded and continuous in some neighborhood
of $(0, 0)$. Let $F$ satisfy a Lipschitz condition and $\|F(x, Y) - G(x, Y)\| < \delta$
for some fixed $\delta$.

2. Let $A$ be a constant matrix. Then the equation

\[ W'(z) = A z^{-1/2}\, W(z) \]

is of Case (1) at the origin. Solve the equation explicitly by a power series in $z^{1/2}$ and
verify that $W(z)$ has a limit as $z \to 0$.

3. Prove the general assertion that in Case (1) the solution of (12.4.4) tends to a limit
when $t \downarrow 0$.

4. Construct examples of equations of type (12.4.4) where Cases (2) and (3) hold and
explicit solutions are available to show that the estimates (12.4.6) and (12.4.7) are
the best possible.

5. Supply omitted details in the proof of Theorem 12.4.2.

6. If Case (1) is present, show that the initial value problem $W'(z) = A(z) W(z)$,
$W(0) = \mathcal{E}$ has a unique solution.


12.5 REMARKS ON A CLASS OF MULTIPLICATIVE INEQUALITIES 


In the classical theory of analytic functions and of functional equations, functions
with a multiplication theorem and functions satisfying a so-called $q$-difference
equation have played a certain role. In the first case an equation of the form

\[ f(qz) = F[f(z)] \tag{12.5.1} \]

is considered where $q$ is a given real or complex number $\ne 1$ and $F(u)$ is a
given analytic function of $u$, say holomorphic in some neighborhood of $u = 0$.
The reader has encountered addition theorems in the theory of trigonometric
functions, exponential functions, possibly also elliptic functions. These are relations
of the form

\[ f(z_1 + z_2) = F[f(z_1), f(z_2)] \tag{12.5.2} \]

where normally $F(u, v)$ is a symmetric algebraic function of the two arguments.
Setting $z_1 = z_2 = z$, we obtain an equation of type (12.5.1) with $q = 2$. By
repeated use of (12.5.2) we can obtain other equations of type (12.5.1), where
$q$ is any preassigned positive integer. Specific examples are given by

\[ f(2z) = 2[f(z)]^{2} - 1, \tag{12.5.3} \]

\[ g(3z) = \frac{3g(z) - [g(z)]^{3}}{1 - 3[g(z)]^{2}}, \tag{12.5.4} \]

satisfied by $\cos az$ and $\tan az$, respectively. Complex multipliers occur in the
theory of so-called complex multiplication for elliptic functions.

To this class of functional equations corresponds a class of functional
inequalities of the form

\[ f(qt) \le F[f(t)], \tag{12.5.5} \]

where now $q$ and $t$ are real positive and the coefficients of the Maclaurin series
of $F(u)$ are also real.

Another source of functional inequalities is offered by $q$-difference equations
which in the linear case are of the form

\[ \sum_{m=0}^{n} a_{n-m}(z)\, f[q^{m} z] = 0, \quad a_0(z) \equiv 1, \tag{12.5.6} \]

where the $a_k(z)$ are given analytic functions and $q \ne 1$ is a complex number.
More generally, the multipliers $1, q, q^{2}, \ldots, q^{n}$ may be replaced by arbitrary distinct
complex numbers $\lambda_0, \lambda_1, \ldots, \lambda_n$. The corresponding functional inequalities will
then be of the form

\[ f(\lambda_n t) \le \sum_{k=0}^{n-1} b_{n-k}(t)\, f(\lambda_k t), \tag{12.5.7} \]

where now $1 = \lambda_0 < \lambda_1 < \lambda_2 < \cdots < \lambda_n$ and the $b_k$'s are given real-valued
continuous functions.

Neither (12.5.5) nor (12.5.7) seems to have received much attention in the 
literature. The author is in no position to develop a theory of such inequalities. 
What follows is just some examples of such inequalities with rambling comments. 


Example 1. We start with the simple case

\[ f(2t) \le f(t), \quad 0 < t, \tag{12.5.8} \]

which is a special case of (12.5.5) as well as of (12.5.7). Any positive decreasing
function $f$ satisfies this inequality on $R^{+}$. Thus a solution may increase
arbitrarily fast as $t$ decreases to 0. Not all solutions are of this form, however.
Thus any real-valued periodic function of

\[ \frac{\log t}{\log 2} \]

of period 1 will satisfy (12.5.8), actually with equality. Such a function
approaches no limit as $t$ tends to 0 or infinity.


Example 2. We take Problem 10, Exercise 12.1, with $t$ replaced by $6t$, i.e.

\[ f(6t) \le \tfrac{1}{2} f(3t) + \tfrac{1}{3} f(2t), \tag{12.5.9} \]

which is of type (12.5.7). This inequality turned out to be determinative in
$C^{+}[0, b]$, actually in $C^{+}[0, \infty]$, with the unique solution $f \equiv 0$. This does not
mean that there are no positive solutions for $t \in R^{+}$, but they are unbounded as $t$
approaches the origin. Thus any power of $t$ of the form

\[ t^{-p} \tag{12.5.10} \]

will satisfy (12.5.9) provided $p$ exceeds the unique positive root of the transcendental
equation

\[ \tfrac{1}{2} \cdot 2^{p} + \tfrac{1}{3} \cdot 3^{p} = 1 \tag{12.5.11} \]
which, to six decimal places, is 0.212127. Moreover, 
-- 1 (12.5.12) 


satisfies for 4 « 5. Since the inequality is linear in f, any linear combination of 
solutions with positive constant multipliers will also be a solution. 
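
The threshold exponent can be located numerically. The sketch below assumes that the inequality reads $f(6t) \le \tfrac12 f(3t) + \tfrac13 f(2t)$, so that $t^{-p}$ is a solution exactly when $\tfrac12 \cdot 2^{p} + \tfrac13 \cdot 3^{p} \ge 1$, and finds the positive root of the corresponding equation by bisection. The helper names and the bracketing interval $[0, 1]$ are illustrative choices.

```python
def h(p):
    """(1/2) 2^p + (1/3) 3^p - 1; its positive zero is the threshold exponent."""
    return 0.5 * 2.0 ** p + 3.0 ** p / 3.0 - 1.0

def bisect(func, lo, hi, tol=1e-12):
    """Plain bisection; assumes func(lo) < 0 < func(hi)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if func(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def satisfies(p, t):
    """Does f(t) = t^(-p) satisfy f(6t) <= f(3t)/2 + f(2t)/3 at the point t?"""
    f = lambda s: s ** (-p)
    return f(6 * t) <= f(3 * t) / 2 + f(2 * t) / 3

p0 = bisect(h, 0.0, 1.0)
above = all(satisfies(p0 + 0.05, t) for t in (0.01, 1.0, 50.0))
below = any(satisfies(p0 - 0.05, t) for t in (0.01, 1.0, 50.0))
print(p0, above, below)
```

The computed root lies near 0.2121; exponents slightly above it pass the check at every sample point, while exponents slightly below it fail, independently of $t$.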


Example 3. The inequality 
f(2t) < 2f(t) (12.5.13) 


is satisfied by any subadditive function. See Section 13.1 for properties of such 
functions. 


Example 4. Our last example is

\[ f(2t) \le 2[f(t)]^{2} - 1, \tag{12.5.14} \]

which is of type (12.5.5). If $a$ is any real number, the functions

\[ 1, \quad \cos at, \quad \cosh at \]

are solutions since they satisfy (12.5.14) with equality. More generally, for
any $A > 1$,

\[ A, \quad A \cos at, \quad A \cosh at \tag{12.5.15} \]

are solutions. There are also polynomial solutions. A simple instance is that of
the partial sums of the Maclaurin series of $\cosh at$. Thus

\[ P_n(t) = \sum_{k=0}^{n} \frac{(at)^{2k}}{(2k)!} \tag{12.5.16} \]

is always a solution. The verification is left to the reader.
With these scanty indications we leave this class of functional inequalities. 
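
A quick numerical check of Example 4, assuming the inequality reads $f(2t) \le 2[f(t)]^{2} - 1$: the sketch evaluates the defect $2[f(t)]^{2} - 1 - f(2t)$ on a grid for $A\cos at$, $A\cosh at$, and two partial sums of the Maclaurin series of $\cosh at$. The values $a = 1.3$, $A = 2.5$ and the grid are arbitrary assumptions; a grid check is of course no substitute for the verification asked of the reader.

```python
import math

def P(n, a, t):
    """Partial sum of the Maclaurin series of cosh(at) through the term of degree 2n."""
    return sum((a * t) ** (2 * k) / math.factorial(2 * k) for k in range(n + 1))

def defect(f, t):
    """2 f(t)^2 - 1 - f(2t); non-negative exactly when the inequality holds at t."""
    return 2 * f(t) ** 2 - 1 - f(2 * t)

grid = [0.05 * i for i in range(-100, 101)]
a, A = 1.3, 2.5
candidates = {
    "A cos at": lambda t: A * math.cos(a * t),
    "A cosh at": lambda t: A * math.cosh(a * t),
    "P_1": lambda t: P(1, a, t),
    "P_3": lambda t: P(3, a, t),
}
results = {name: min(defect(f, t) for t in grid) for name, f in candidates.items()}
print(results)
```

For the partial sums the defect vanishes at $t = 0$ and is made up of the even powers of $t$ beyond degree $2n$, all with positive coefficients, which is what the grid minimum reflects.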


EXERCISE 12.5 
1. How are the inequalities (12.5.5) and (12.5.7) reduced to the canonical form
$f(t) \le T[f]$ considered in this chapter?
2. Verify the assertions made in Example 1. 


3. Same question for Example 2. 
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4. For te Κ΄ equation (12.5.9) has solutions of the form rt? and --, Find the 
limitations on p and 4. 


5. Show that if a solution of (12.5.14) has a positive infimum $m$ for $t \in R^{+}$, then $m \ge 1$.

6. Similarly, verify that a finite supremum $M$ of a solution satisfies $M \ge 1$.

7. Show that $A \exp(at)$, $a > 0$, is a solution of (12.5.14) for any $A > 4$.

8. Why are the polynomials (12.5.16) solutions?

9. Are the partial sums of the Maclaurin series of $\cos at$ also solutions?


COLLATERAL READING 


The reader may consult Section 1.5 of: 


HILLE, E., Lectures on Ordinary Differential Equations, Addison-Wesley, Reading, Mass. 
(1969). Sections 3.1 and 6.3 of this treatise have a bearing on Section 12.4 above. 


For differential and integral inequalities the basic work is: 


WALTER, W., Differential- und Integralungleichungen, Springer Tracts in Natural
Philosophy, Vol. 2, Springer-Verlag, Berlin (1966).


The original Gronwall Lemma occurs in: 


GRONWALL, T. H., Note on the derivative with respect to a parameter of the solutions of 
a system of differential equations, Ann. Math. (2) 20, (1918) 292-6. 


For antiderivations and relevant literature see: 


MILLER, J. B., Some properties of Baxter operators, Acta Math. Acad. Sci. Hung. 17, 
(1966) 387-400. 


13 FUNCTIONAL INEQUALITIES II 
FUNCTIONS ON PRODUCT SPACES 


Functional inequalities on product spaces play an important role in analysis. Two 
striking examples come to mind right away: subadditive functions and convex 
functions. Actually these are closely related. The first class is intimately connected 
with the concept of invariance under addition as well as with the notion of convexity. 
Functions which are piecewise convex or concave are, whether we like it or not, 
basic in calculus, elementary as well as advanced. Convexity is next to positivity 
the most fruitful concept in analysis, and its implications are legion. We hope to 
bring out some of these connections in the present chapter. Convexity is, of course, 
a geometrical concept, but ever since the days of Minkowski, who was the pioneer 
and pathfinder in this area, it has become the happy playground of the analyst. 

There are four sections: Subadditive and suboperative functions; Semi- 
modules and subadditive functions in R”; Convex functions; and Non- 
Archimedean valuations. 


13.1 SUBADDITIVE AND SUBOPERATIVE FUNCTIONS 


A general form of functional inequalities on product spaces is the following. Let $\circ$
denote an associative, binary and commutative operation and let $G(u, v)$ be a given
symmetric function from $R \times R$ to $R$ (more generally from $R^{m} \times R^{m}$ to $R$) and
consider the inequality

\[ f(x \circ y) \le G[f(x), f(y)]. \tag{13.1.1} \]

There are a number of interesting special cases.
We shall start with

\[ \circ = +, \quad G(u, v) = u + v. \tag{13.1.2} \]

A function satisfying

\[ f(x + y) \le f(x) + f(y) \tag{13.1.3} \]

is said to be subadditive. The domain of definition $D = D[f]$ of $f$ is for the time
being taken to be a subset of $R = R^{1}$. $D[f]$ must be closed under addition. The three
most important cases are: $D$ is $R$ or $R^{+}$ or $Z^{+}$ (= the set of positive integers). To
get interesting results in the first two cases, we must assume $f$ to be measurable.
As will be seen in Section 14.2, there are non-measurable additive functions, every
one of which is trivially subadditive.
As simple examples of subadditive functions in $R^{+}$ we present

\[ x^{\alpha},\ 0 < \alpha \le 1, \quad \text{and} \quad -x^{\beta},\ 1 \le \beta < \infty. \tag{13.1.4} \]

The verification is left to the reader.

Theorem 13.1.1. If $D[f] = R^{+}$, if $f$ is measurable and subadditive and if $f$
assumes the value $+\infty$ at most in a set of measure zero, then $f$ is bounded above
on every interval $[\alpha, \beta]$ where $0 < \alpha < \beta < +\infty$.

Proof. Suppose the assertion is false and there exists an $f$ with the stated
properties which is not bounded in an interval $[\alpha, \beta]$. We can then find a sequence
of points $\{x_n\}$ such that (1) $\alpha \le x_n \le \beta$, (2) $\lim_{n \to \infty} x_n = x_0$ exists,
(3) $f(x_n) \ge 2n$. We have then for $0 < s < x_n$

\[ 2n \le f(x_n) = f(x_n - s + s) \le f(x_n - s) + f(s). \tag{13.1.5} \]

Thus either $f(x_n - s) \ge n$ or $f(s) \ge n$ and the measurable set

\[ S_n = \{t;\ n \le f(t),\ 0 < t < x_n\} \]

has measure

\[ m[S_n] \ge \tfrac{1}{2} x_n \ge \tfrac{1}{2} \alpha. \]

Now consider the set

\[ T_n = \{t;\ n \le f(t),\ 0 < t < \beta\}. \]

It is measurable and contains $S_n$. Hence

\[ m[T_n] \ge \tfrac{1}{2} \alpha. \]

This holds for every $n$ and $T_n \supset T_{n+1} \supset \bigcap_n T_n \equiv S$. Since every inequality $f(t) \ge n$
holds in $S$, $f(t) = +\infty$ for all $t \in S$. Here $m[S] \ge \tfrac{1}{2} \alpha > 0$ so we have a
contradiction. Thus for every measurable subadditive $f$ and every interval $I$ whose closure
is in $R^{+}$ there is an $M(f, I) < \infty$ such that $f(x) \le M(f, I)$ for $x \in I$. ∎


Theorem 13.1.2. If in addition to the hypotheses of Theorem 13.1.1, $f$ assumes
the value $-\infty$ at most in a set of measure zero, then $f$ is bounded in every interval
$I = [\alpha, \beta]$, $0 < \alpha < \beta < \infty$.

Proof. Again suppose there exists an $f$ for which the assertion does not hold.
Such an $f$ is bounded above on $I$. It is to be shown that $f$ is also bounded below in $I$.
If not, there is a sequence $\{s_n\}$ such that (1) $\alpha \le s_n \le \beta$, (2) $\lim s_n = s_0$ exists, and
(3) $f(s_n) < -n$ for each $n$. Then for $\alpha \le s \le \beta$

\[ f(s + s_n) \le f(s) + f(s_n) \le M - n, \tag{13.1.6} \]

where $M = M(f, I)$ is the least upper bound for $f$ in $I$. This says that there is an
interval $I_n = [s_n + \alpha, s_n + \beta]$ where

\[ f(t) \le M - n, \quad t \in I_n, \]

and the length of $I_n$ is $\beta - \alpha$. Set

\[ I_n^{*} = \bigcup_{k \ge n} I_k. \]

Then $\{I_n^{*}\}$ is a nested sequence of intervals, each of length at least $\beta - \alpha$ and
$f(t) \le M - n$ for $t \in I_n^{*}$. It follows that all the inequalities are valid in $\bigcap_n I_n^{*}$;
this is a measurable set of measure at least $\beta - \alpha > 0$. This requires that $f$ assumes
the value $-\infty$ in a set of positive measure and this violates the assumptions. It
follows that no $f$ can satisfy the stated hypotheses and be unbounded below in $I$.
Hence there exists a finite $M_0(f, I)$ such that $f(x) \ge M_0(f, I)$ and $f$ is bounded in $I$. ∎


The restriction to an interval bounded away from 0 and infinity is essential.
The function $\exp(1/x)$ is subadditive (why?) and continuous in $R^{+}$ but clearly not
bounded above as $x \downarrow 0$. The second function under (13.1.4) suggests that a function
subadditive on $R^{+}$ may tend arbitrarily fast to $-\infty$ as $x \to +\infty$. The reader will
have a chance to prove this in Problem 2, Exercise 13.1. The first function under
(13.1.4) shows that $f(x)$ may tend to $+\infty$ with $x$, but here the rate of growth
appears to be more moderate. In fact, we have

Theorem 13.1.3. If $f$ is measurable, finite-valued and subadditive in $R^{+}$, then

\[ \inf_{x > 0} \frac{f(x)}{x} = \lim_{x \to \infty} \frac{f(x)}{x}. \tag{13.1.7} \]


Proof. The infimum in question is either a finite number $b$ or $-\infty$. We shall give
the proof for the finite case. By the definition of the infimum, we can find a positive
number $a$ such that for a given, arbitrarily small, $\varepsilon > 0$ we have

\[ \frac{f(a)}{a} < (1 + \varepsilon)\, b. \]

Suppose now that $x = (n + 1)a + c$ where $0 \le c < a$. Then

\[ b \le \frac{f(x)}{x} = \frac{f(na + a + c)}{(n + 1)a + c} \le \frac{n f(a) + f(a + c)}{(n + 1)a + c}
   \le \frac{na}{(n + 1)a + c}\,(1 + \varepsilon)\, b + \frac{M(f, I)}{(n + 1)a + c}, \quad I = [a, 2a]. \]

As $n \to \infty$ the last member tends to $(1 + \varepsilon)\, b$ and (13.1.7) follows. The case where
the infimum is $-\infty$ is left to the reader. ∎
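
The limit relation $\inf_{x>0} f(x)/x = \lim_{x\to\infty} f(x)/x$ can be watched on a concrete case. The sketch below uses $f(x) = x + \sqrt{x}$ as a sample subadditive function on $R^{+}$ (an illustrative choice, not taken from the text), tests subadditivity on random pairs, and prints $f(x)/x$ along a sequence tending to infinity; here the infimum is 1 and is approached but not attained.

```python
import math
import random

def f(x):
    """Sample subadditive function on R^+: a linear term plus a concave term vanishing at 0."""
    return x + math.sqrt(x)

random.seed(0)
pairs = [(random.uniform(0.01, 50), random.uniform(0.01, 50)) for _ in range(10000)]
# Small tolerance guards against rounding in the comparison.
subadditive_ok = all(f(x + y) <= f(x) + f(y) + 1e-12 for x, y in pairs)

ratios = [f(x) / x for x in (1.0, 1e2, 1e4, 1e6, 1e8)]
print(subadditive_ok, ratios)
```

The ratios decrease monotonically toward 1, which is the behavior the theorem describes for the finite case.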


If $D[f] = R$ instead, the corresponding results are

Theorem 13.1.4. If $f$ is measurable, finite-valued and subadditive in $R$, then $f$ is
bounded on compact subsets of $R$.

Theorem 13.1.5. Under the same assumptions

\[ \lim_{x \to +\infty} \frac{f(x)}{x} = \inf_{0 < x} \frac{f(x)}{x} \equiv b < \infty, \tag{13.1.8} \]

\[ \lim_{x \to -\infty} \frac{f(x)}{x} = \sup_{x < 0} \frac{f(x)}{x} \equiv a > -\infty. \tag{13.1.9} \]

Here $a \le b$.
The proofs are left to the reader.
From $f(2x) \le 2 f(x)$ it follows that

\[ \limsup_{x \downarrow 0} f(x) = \limsup_{x \downarrow 0} f(2x) \le 2 \limsup_{x \downarrow 0} f(x). \tag{13.1.10} \]

Hence, if $\limsup_{x \downarrow 0} f(x)$ is finite, it is necessarily non-negative. This is also true for
$\liminf_{x \downarrow 0} f(x)$. If $f$ is defined and subadditive in $R$, there are similar inequalities for
approach to $x = 0$ from the left. A particularly interesting case is that where

\[ \lim_{x \to 0} f(x) = 0 \tag{13.1.11} \]

either for approach from the right or for two-sided approach.


Theorem 13.1.6. If $f$ is measurable, finite-valued and subadditive in $R^{+}$ and if
(13.1.11) holds, then for $x > 0$

\[ \limsup_{h \downarrow 0} f(x + h) \le f(x) \le \liminf_{h \downarrow 0} f(x - h). \tag{13.1.12} \]

If the same assumptions are valid in $R$, then $f$ is continuous in $R$.

Proof. Here (13.1.12) follows from the inequalities

\[ f(x + h) \le f(x) + f(h), \quad f(x) \le f(x - h) + f(h). \]

If (13.1.11) holds for two-sided approach, then $h$ can be replaced by $-h$ in both
inequalities to obtain

\[ \limsup_{h \downarrow 0} f(x - h) \le f(x) \le \liminf_{h \downarrow 0} f(x + h). \]

This combined with (13.1.12) gives continuity. ∎


Moduli of continuity [see (3.2.11) and Problem 5 of Exercise 3.2] are examples 
of subadditive functions. So are Minkowski functionals or gauge functions [see 
(1.6.18) and following]. Subadditive functions also play a basic role in the theory of 
semi-groups of linear operators. 

C. T. Ionescu Tulcea (1960) generalized the inequality (13.1.3) in various
directions. One of his inequalities is

\[ f(x + y) \le g(x) + h(y), \tag{13.1.13} \]

which has important bearings on semi-group theory. Among other results he
proved a generalization of Theorem 13.1.1.

Theorem 13.1.7. If $g$ and $h$ assume the value $+\infty$ at most on sets of zero
measure and if either $g$ or $h$ is measurable, then $f$ is bounded above on compact
sets.

Proof. To fix the ideas, suppose that (13.1.13) holds for $x$ and $y$ in $R^{+}$ and that $g$
is measurable. Then $f$ can take the value $+\infty$ at most on a set of measure zero.
Suppose now that $f$ is not bounded above in $[\alpha, \beta]$. There is then a sequence of
points $\{x_n\}$ in this interval where $f(x_n) \ge 2n$. Let $x_0$ satisfy $0 < x_0 < \inf x_n$.
We have now

\[ 2n \le f(x_n) \le \begin{cases} g(s) + h(x_n - s), \\ g(x_n - s) + h(s), \end{cases} \quad 0 < s < x_n. \]

Set

\[ S_n = \{s;\ n \le g(s),\ 0 < s < x_0\}, \quad T_n = \{s;\ n > g(s),\ 0 < s < x_0\}. \tag{13.1.14} \]

These are disjoint measurable sets so that

\[ m[S_n] + m[T_n] = x_0. \]

In $T_n$ we have

\[ h(x_n - s) \ge n. \]

Next note that

\[ S_n \supset S_{n+1} \supset \bigcap_k S_k \equiv S. \]

If $m[S] > 0$, then $g(s) \ge n$ for all $n$ and all $s \in S$. This requires that $g(s) = +\infty$
in $S$, a set of positive measure, and this contradicts the assumptions.

Suppose now that $S$ is of measure zero. This implies that $\{m[S_n]\}$ is a
non-increasing sequence of limit 0. Then $\{m[T_n]\}$ is a non-decreasing sequence with
limit $x_0$. In the sequence $\{x_n\}$ we can find a sub-sequence $\{x_{n_k}\}$ which converges to
a limit $x^{*} > x_0$. Set

\[ U_{n_k} = \{s;\ s = x_{n_k} - t,\ t \in T_{n_k}\}. \tag{13.1.15} \]

The sets $U_{n_k}$ are confined to the interval $[x^{*} - x_0 - \delta, x^{*} + \delta]$ for all large values
of $k$. These sets are measurable and $m[U_{n_k}]$ increases to $x_0$. Further, $h(s) \ge n_k$ for
$s$ in $U_{n_k}$. Set

\[ V_p = \bigcup_{p \le k} U_{n_k}. \tag{13.1.16} \]

Then $m[V_p] \ge x_0$ for each $p$ and $h(s) \ge n_p$ for all $s \in V_p$. Here $V_p \supset V_{p+1}$,
$\bigcap V_p \ne \emptyset$. It follows that $h(s) = +\infty$ in a set of positive measure and this is
again a contradiction. The reader should note that our discussion of sets in which
$h(s)$ is large does not involve a tacit assumption that $h$ is measurable. Thus it is seen
that $f$ must be bounded above on compact sets. ∎
There is an interesting generalization of the subadditive inequality to the case

\[ f(x \circ y) \le f(x) + f(y), \tag{13.1.17} \]

where again $\circ$ denotes an associative, binary, commutative operation. A solution
of such an equation is said to be suboperative (with respect to the operation $\circ$).
E. Hille and R. S. Phillips (1957) have considered special cases of such inequalities and
given some discussion of the general case. The latter was treated in detail by
C. T. Ionescu Tulcea in 1960. Tulcea also considered the generalization

\[ f(x \circ y) \le g(x) + h(y), \tag{13.1.18} \]

which is important in the theory of Lie semi-groups of linear operators in the
strong topology.

Subadditive functions on $R^{m} \times R^{n}$ to $R$ have been studied by R. A. Rosenbaum
(1950). Some related results are discussed in the next section.

Actually subadditivity extends to more general situations. In considering the
inequality (13.1.3) it is necessary to specify domain and range of $f$. The domain
$D[f]$ in an additive space must be closed under addition and the range $R[f]$
must belong to an additive partially ordered space. There is no lack of such sets $D$
and $R$ and each choice gives rise to a theory of subadditive functions. A particular
extension may be trivial but should not be condemned without a fair trial.


EXERCISE 13.1 


1. Prove that a positive non-increasing function on R* is subadditive. What con- 
sequences will this have for the question of growth properties of a subadditive 
function? 


2. If $g$ is positive and non-decreasing on $R^{+}$, prove that $-g(x)$ is subadditive on $R^{+}$.
Growth properties?

3. Prove (13.1.7) if the infimum is $-\infty$.

4. Prove Theorem 13.1.4.

5. Prove Theorem 13.1.5.

6. Verify that $x^{k}$, $0 < k \le 1$, is subadditive in $R^{+}$. How about $|x|^{k}$ in $R$?

7. Verify that $|\sin x|$ is subadditive in $R$.

8. Prove the analogue of Theorem 13.2.1 for functions subadditive in an interval
$(a, \infty)$ where $0 < a$. The method used above applies to an interval $[\alpha, \beta]$ with
$2a < \alpha$.

9. Extend Theorem 13.1.3 under similar assumptions. What infimum enters in the limit
relation?

10. Formulate and prove analogues of Theorem 13.1.3 for $Z^{+}$, i.e. if $\{a_n\}$ is a sequence
such that $a_{m+n} \le a_m + a_n$ for all positive integers, what is $\lim_{n \to \infty} a_n / n$?

11. Does an analogue of Theorem 13.1.1 hold for the suboperative inequality (13.1.17)
when $\circ = \cdot$ and $D[f] = R^{+}$?

12. What is the analogue of Theorem 13.1.3 for this case?

13. Suppose that $x \circ y = (x + y)(1 + xy)^{-1}$ and consider the corresponding inequality
(13.1.17). Show that the operation has the required properties. Show the existence
of a function $g$ such that $f[g(t)]$ is subadditive. Find analogues of Theorems 13.1.1
and 13.1.3.


13.2 SEMI-MODULES AND SUBADDITIVE FUNCTIONS IN $R^{m}$


We recall that the domain of definition of a subadditive function f must be invariant 
under addition. This motivates the following 


Definition 13.2.1. A semi-module $S$ is an addition-invariant subset of a vector
space $X$ such that (1) $X$ is a commutative group under addition satisfying
Postulates $(A_1)$ to $(A_5)$ of Definition 2.1.1, (2) $X$ is a Hausdorff space in which
the operations $(x, y) \to x + y$ and $x \to -x$ are continuous.


Thus if x, ye S, so does x + y. Hence any semi-module S can be the domain 
of definition of a subadditive function and, vice versa, the domain of definition of a 
subadditive function is a semi-module. 

We start by deriving some properties of semi-modules to be used in the 
following. The first is a lemma the proof of which is left to the reader. 


Lemma 13.2.1. If $S \subset X$ is a semi-module, so is the closure of $S$, as well as the
interior of $S$.


The zero element plays a peculiar role in our theory, as is shown already on the
line. The following class of semi-modules is particularly important.


Definition 13.2.2. An angular semi-module is one where (1) the zero element
belongs to $\bar{S}$ and (2) $S$ is open.


The reason for the name will become clear later. On the line there are only
three angular semi-modules, namely, $R^{+}$, $R^{-}$, and $R$ itself. Already in the plane
there is a continuum of possibilities, and the angular character becomes evident.
In $R^{m}$, $m > 1$, there is a peculiar alternating role of semi-modules and subadditive
functions. To define a subadditive function on $R^{m}$ we need an angular semi-module,
but the boundary of the latter is defined by a subadditive function on $R^{m-1}$ and so
on until we get down to $R^{1}$. We shall get a glimpse of this interplay in the following.
The next result still holds for an additive Hausdorff group $X$.

Lemma 13.2.2. If $S \subset X$ is an angular semi-module, then $S = \operatorname{Int}(\bar{S})$.

Proof. Since $S$ is open by assumption, $S = \operatorname{Int}(S) \subset \operatorname{Int}(\bar{S})$. To show that the
inclusion is actually an equality, we have to prove that no point $x$ of the complement
of $S$ can be in $\operatorname{Int}(\bar{S})$. Suppose that $x \notin S$ and let $N_x$ be a neighborhood of $x$,
arbitrarily small. It is enough to show that $N_x$ contains an open set $G$ such that
$G \cap \bar{S} = \emptyset$. To this end, take a neighborhood $N_0$ of the origin so small that
$x - N_0 \subset N_x$. By assumption (1) of Definition 13.2.2, the zero element of $X$
belongs to $\bar{S}$. Since $S$ is open, we can find an element $y$ which together with a full
neighborhood $N_y$ belongs to $S \cap N_0$. Then $G = x - N_y$ is an open set belonging
to $N_x$. No point $u$ of $G$ can belong to $\bar{S}$, for if it did, then $x = (x - u) + u$ would
belong to $S$ since $x - u$ already has this property (why?). It follows that
$x \notin \operatorname{Int}(\bar{S})$. ∎

Corollary. Under the same assumptions $S + \bar{S} \subset S$.

Proof. Let $x \in S$, $y \in \bar{S}$. Since $S$ is open, there is a neighborhood $N_x$ of $x$
also in $S$ and hence in $\bar{S}$. This says that the open set $y + N_x \subset \bar{S}$ and hence
it belongs to $\operatorname{Int}(\bar{S}) = S$. ∎


From this point on we specialize and take $X = R^{m}$.

Lemma 13.2.3. If $S$ is a semi-module in $R^{m}$ and if every neighborhood $N_0$ of 0
contains an element of $S$ distinct from 0, then there exists at least one non-zero
vector $b$ such that $\rho b \in \bar{S}$ for all $\rho > 0$.

Proof. The assumptions imply the existence of a sequence $\{x_k\} \subset S$ with $x_k \ne 0$
and $\lim_{k \to \infty} x_k = 0$. The sequence of unit vectors $\{x_k \|x_k\|^{-1}\}$ admits of at least one
vector which is a cluster point of the sequence and without loss of generality we may
assume that the whole sequence converges to a limit $b$ with $\|b\| = 1$. It is to be
shown that all vectors $\rho b$, $0 < \rho$, are in $\bar{S}$. To this end, set $n_k = [\rho \|x_k\|^{-1}] + 1$
where $[u]$ is the largest integer less than or equal to $u$. Now $\{n_k x_k\}$ is a sequence of
vectors in $S$, for if $x \in S$, so does $nx$ for any natural number $n$. Here

\[ n_k x_k = \{[\rho \|x_k\|^{-1}] + 1\}\, x_k \]

with the limit $\rho b$ as $k \to \infty$. It follows that $\rho b \in \bar{S}$. ∎

Corollary. The conclusion of Lemma 13.2.3 holds for any angular semi-module
in $R^{m}$.

It may very well happen that there is one and only one vector $b$ with the stated
properties. This is illustrated by

Example 1. Take the plane $R^{2}$ and define $S$ as the set of points $(x, y)$ such that
$0 < x$, $0 < y < x^{2}$. This is seen to be an angular semi-module and the only unit
vector $b$ such that $\rho b \in \bar{S}$ is $b = (1, 0)$ and this vector is in $\bar{S}$ but not in $S$. There is
no vector in $S$ with the stated properties.
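
The claims of Example 1 can be probed numerically. The sketch below assumes the region is $S = \{(x, y);\ 0 < x,\ 0 < y < x^{2}\}$ (the exponent 2 is the reading adopted here), checks on random samples that sums of points of $S$ stay in $S$, and tests for a few directions whether short vectors $\rho(\cos\theta, \sin\theta)$ lie in the closure of $S$. All names, sample sizes, and test angles are illustrative assumptions.

```python
import math
import random

def in_S(x, y):
    """Membership in S = {(x, y): 0 < x, 0 < y < x^2} (assumed reading of the example)."""
    return x > 0 and 0 < y < x * x

def sample_S(rng):
    """Random point of S: pick x > 0, then y as a fraction of x^2 strictly inside (0, x^2)."""
    x = rng.uniform(0.01, 5.0)
    return x, rng.uniform(0.001, 0.999) * x * x

def ray_in_closure(theta, rhos=(1e-1, 1e-2, 1e-3, 1e-4)):
    """Is rho * (cos theta, sin theta) in the closure {x >= 0, 0 <= y <= x^2} for each listed rho?"""
    for r in rhos:
        x, y = r * math.cos(theta), r * math.sin(theta)
        if not (x >= 0 and 0 <= y <= x * x):
            return False
    return True

rng = random.Random(1)
points = [sample_S(rng) for _ in range(2000)]
closed_under_addition = all(
    in_S(p[0] + q[0], p[1] + q[1]) for p, q in zip(points[::2], points[1::2])
)
print(closed_under_addition, ray_in_closure(0.0), ray_in_closure(0.1), ray_in_closure(-0.1))
```

Only the direction $\theta = 0$ survives for arbitrarily small $\rho$, in line with the statement that $b = (1, 0)$ is the single admissible unit vector.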
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Lemma 13.2.4. If S is an angular semi-module in R™, then there is at least one 
unit vector Ὁ such that x € S implies x + phe S, 0 < p. 


Proof. We take b as in the preceding lemma and combine with the corollary of 
Lemma 13.2.2. Here xe Ὁ, pbe G, so thatx + pheS + Ec Ὁ Π 


This says that an angular semi-module S in R^m is the union of a system of
parallel line segments extending to infinity in one direction. The problem is now to
characterize the other endpoints. What is the greatest lower bound for p in order
that x + pb shall belong to S? In Example 1 there is a simple answer to this
question: for a given y > 0 we must have x > y^{1/2}. It is no accident that
f(y) = y^{1/2} is actually a subadditive function of y for y > 0. We shall carry out
the characterization of the boundary of angular semi-modules in R^2, where it is
easy to visualize what is going on. The reader will find it instructive to carry out
the analogous description in R^3. The basis for the discussion is the observation
that the projection of an angular semi-module in R^m on the hyperplane
perpendicular to the distinguished vector b is an angular semi-module in R^{m−1}.
To this new semi-module the same observation applies, which gives the
step-by-step reduction.
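The two facts used about Example 1, closure of the set 0 < x, 0 < y < x^2 under addition and subadditivity of the boundary function f(y) = y^{1/2}, can be spot-checked by random sampling. This is a sanity check rather than a proof; the function names, the sampling ranges and the tolerance are assumptions made for the illustration.

```python
import math
import random


def in_S(x, y):
    """Membership in the set of Example 1: 0 < x and 0 < y < x^2."""
    return x > 0 and 0 < y < x * x


def f(y):
    """Boundary function of Example 1, f(y) = y^(1/2)."""
    return math.sqrt(y)


def check(trials=10000, seed=0):
    """Sample pairs of points of S; test closure under addition and f(y1+y2) <= f(y1)+f(y2)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x1, x2 = rng.uniform(0.01, 10.0), rng.uniform(0.01, 10.0)
        y1 = rng.uniform(0.001, 0.999) * x1 * x1   # guarantees (x1, y1) in S
        y2 = rng.uniform(0.001, 0.999) * x2 * x2   # guarantees (x2, y2) in S
        if not in_S(x1 + x2, y1 + y2):
            return False
        if f(y1 + y2) > f(y1) + f(y2) + 1e-12:
            return False
    return True


if __name__ == "__main__":
    print(check())
```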

We now take an angular semi-module in R^2 represented by the complex plane.
Since a rotation about the origin does not change the character of S being a semi-
module, we may assume that the distinguished vector b is (1, 0). Then the points
of S are of the form z = x + iy and if z_0 ∈ S, then so is z_0 + p, 0 < p. We denote
the projection of S on the imaginary axis by P, so that

    P = {y; x + iy ∈ S}.    (13.2.1)

Lemma 13.2.5. P is an angular semi-module on R.

Proof. Let z_1 ∈ S, z_2 ∈ S where z_1 = x_1 + iy_1, z_2 = x_2 + iy_2. This implies that
y_1 and y_2 belong to P as well as their sum y_1 + y_2. Since S is open, so is P, and
since 0 ∈ S̄ we have 0 ∈ P̄ and this proves the assertion. ∎


There are only three possibilities for P, namely R, R^+, and R^−. We can
discard the last case since it is symmetric to R^+ and does not give rise to a different
theory. We now define

    f(y) = inf {x; x + iy ∈ S}.    (13.2.2)

Theorem 13.2.1. f(y) is a subadditive function of y on P, upper semi-continuous
and with lim inf_{y→0} f(y) = 0 or −∞. In the latter case f(y) ≡ −∞.


Proof. Suppose y_1 ∈ P, y_2 ∈ P. By the property of an infimum, for every δ > 0
there is an x_1, with x_1 + iy_1 ∈ S, such that x_1 < f(y_1) + δ. There is also an x_2,
with x_2 + iy_2 ∈ S, such that x_2 < f(y_2) + δ. Set z_1 = x_1 + iy_1, z_2 = x_2 + iy_2.
By the continuity of the mapping (z_1, z_2) → z_1 + z_2, for every neighborhood N of
z_1 + z_2 there are neighborhoods N_1 of z_1 and N_2 of z_2 such that N_1 + N_2 ⊂ N
where N_1 = {x + iy; |x − x_1| < δ, |y − y_1| < δ} and
N_2 = {x + iy; |x − x_2| < δ, |y − y_2| < δ}. From x_1 < f(y_1) + δ and
x_2 < f(y_2) + δ we infer that

    f(y_1 + y_2) ≤ x_1 + x_2 < f(y_1) + f(y_2) + 2δ.

Here δ can be arbitrarily small so that

    f(y_1 + y_2) ≤ f(y_1) + f(y_2),

which is the subadditive property.

To prove the upper semi-continuity of f, let y_1 ∈ P. By the property of the
infimum, for every δ > 0 there is an x_1 with z_1 = x_1 + iy_1 ∈ S such that
x_1 < f(y_1) + ½δ. Since S is open, there is a neighborhood N_1 of z_1 such that
N_1 ⊂ S. Now x_1 < f(y_1) + ½δ implies f(y) ≤ x < x_1 + ½δ < f(y_1) + δ for
every

    x + iy ∈ N_1 = {x + iy; |x − x_1| < ½δ, |y − y_1| < ½δ}.

This is the upper semi-continuity. Such a function, in particular, is measurable and
the set where f(y) = +∞ is void. Finally, since 0 ∈ P̄, the inequality

    f(2y) ≤ 2f(y)

implies

    lim inf_{y→0} f(2y) = lim inf_{y→0} f(y) ≤ 2 lim inf_{y→0} f(y)

with the two alternatives lim inf_{y→0} f(y) = 0 or −∞. In the second case

    lim inf_{h→0} f(y + h) ≤ f(y) + lim inf_{h→0} f(h) = −∞

for all y ∈ P. ∎


The second alternative cannot be excluded. Thus if S = C, the whole complex
plane, f(y) = −∞ for all y ∈ R. Similarly for S equal to the upper or the lower
half-plane where f(y) = −∞ in R^+ or in R^−, respectively.


Lemma 13.2.6. An angular semi-module in R^2 is a simply connected point set.
It is either the whole plane or a subset of a half-plane.

Proof. We can join two points z_1 and z_2 in S by a broken line [z_1, z_3, z_4, z_2]
where z_3 = x_3 + iy_1, z_4 = x_3 + iy_2, and x_3 exceeds the finite least upper bound
of f(y) in the interval [y_1, y_2]. If P = R^+ or R^−, then S is confined to a half-
plane, the upper or the lower, respectively. If P = R, then the positive real axis
belongs to S. Suppose now that there are vectors z_1 and z_2 in S where arg z_j = θ_j
and −π < θ_1 < 0 < θ_2 < π. Then by the additivity all points z = re^{iθ} with
θ_1 − ε < θ < θ_2 + ε and large values of r are in S. If now θ_2 − θ_1 > π, then all
the distant points in the complementary sector would also be in S and this can be
true iff S is the whole plane. It follows that if S ≠ C, then S will occupy an angle
at the origin of opening at most π. ∎

Similarly, we see that an angular semi-module in R^m, m > 2, is either R^m itself
or is confined to a half-space. If v denotes a vector in R^m and S is a semi-module in
R^m, we shall consider a subadditive function F(v) defined in S. Thus

    F(v_1 + v_2) ≤ F(v_1) + F(v_2)    (13.2.3)

for all v_1, v_2 in S. F shall be finite-valued and measurable. Here measurability is
taken in a rather sweeping sense and is meant to include k-dimensional
measurability of the restrictions of F to k-dimensional manifolds for all k < m.
In particular, we shall need radial measurability.


Definition 13.2.3. A function F from R^m to R^1 defined in an angular semi-module
S is said to be radially measurable if for each fixed v ∈ S the function F(rv) is a
measurable function of r for all large values of r.

Since S is open and v ∈ S by assumption, then rv ∈ S for all values of r in a
sequence of intervals

    [n(1 − δ), n(1 + δ)],    n = 1, 2, 3, ...

for a small value of δ. These intervals ultimately overlap. Thus there exists for
each v ∈ S a number a(v) ≥ 0 such that F(rv) is defined for all r > a(v).


Theorem 13.2.2. Let F be defined, finite-valued, radially measurable and
subadditive in an angular semi-module S ⊂ R^m. Then

    lim_{r→∞} F(rv)/r = G(v)    (13.2.4)

exists for all v ∈ S. Moreover,

    −∞ ≤ G(v) = inf {F(rv)/r; v ∈ S, r > 2a(v)}.    (13.2.5)

Proof. For fixed v ∈ S the function F(rv) is defined for r > a(v) where a(v) is the
infimum of the numbers a such that rv ∈ S for all r > a. Further, F(rv) is a sub-
additive function of r for r > a(v). Now Theorem 13.1.3 may be extended to func-
tions f(r) which are subadditive on an interval (a, ∞). Again the limit of f(r)/r
exists and equals the infimum of the ratio for r > 2a. See Problem 9,
Exercise 13.1. ∎


To illustrate we shall work out some examples in R^2, again represented by the
complex plane. As we shall see, it is enough to work out the results for unit
vectors. We set

    G(e^{iθ}) = g(θ).    (13.2.6)

Example 2. Take S as the right half-plane, x > 0, and set

    F(z) = F(x + iy) = |y|^{1/2} + x.    (13.2.7)

This is the sum of an additive and a subadditive function, each of one variable, so
the sum is subadditive in z. Here

    g(θ) = cos θ.

Example 3. Again take S as the right half-plane and set

    F(x + iy) = y^2 x^{-1}.    (13.2.8)

The subadditivity follows from the identity

    x_2(x_1 + x_2) y_1^2 + x_1(x_1 + x_2) y_2^2 − x_1 x_2 (y_1 + y_2)^2 = (x_1 y_2 − x_2 y_1)^2 ≥ 0.

Here

    g(θ) = sin^2 θ sec θ,    |θ| < ½π.    (13.2.9)
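The indicators claimed in Examples 2 and 3 can be compared with the ratio F(re^{iθ})/r for a large but finite r. The sketch below is only a numerical illustration; the function names and the sample values of θ and r are assumptions made here. In Example 3 the ratio is independent of r, while in Example 2 it exceeds cos θ by a term of order r^{-1/2}.

```python
import math


def F2(x, y):
    """Example 2: F(x + iy) = |y|^(1/2) + x."""
    return math.sqrt(abs(y)) + x


def F3(x, y):
    """Example 3: F(x + iy) = y^2 / x, defined for x > 0."""
    return y * y / x


def radial_ratio(F, theta, r):
    """F(r e^{i theta}) / r, whose limit as r grows is the indicator g(theta)."""
    return F(r * math.cos(theta), r * math.sin(theta)) / r


def g2(theta):
    return math.cos(theta)


def g3(theta):
    return math.sin(theta) ** 2 / math.cos(theta)


if __name__ == "__main__":
    theta = 0.7
    for r in (1e2, 1e4, 1e8):
        print(r, radial_ratio(F2, theta, r) - g2(theta),
              radial_ratio(F3, theta, r) - g3(theta))
```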


If S = C, then g(θ) is defined for all θ. In the general case there exists an angle
of opening ≤ π in which g is defined. This includes the possibility that g is
identically −∞ in part or all of the sector. The following example illustrates this
phenomenon.

Example 4. Take S to be the right half-plane and set

    F(x + iy) = −x^2 = −r^2 cos^2 θ,    |θ| < ½π,    (13.2.10)

with g(θ) = −∞, for all θ.

Following a terminology introduced by G. Pólya (1929) for functions typified
by g(θ), we shall refer to the function G defined by (13.2.4) as the radial growth
indicator or simply as the indicator of the subadditive function F defined on the
semi-module S. Such indicators were first introduced by E. Phragmén (1863-1937)
and Lindelöf in their epoch-making memoir on the principle of the maximum of
1908. If f is (Cauchy) holomorphic in a sector S, say |arg z| < α, and is of
exponential growth in S, i.e.

    |f(z)| ≤ M exp (B|z|),    z ∈ S,

then

    lim sup_{r→∞} (1/r) log |f(re^{iθ})| = g(θ),    |θ| < α,    (13.2.11)

exists and −∞ ≤ g(θ) ≤ B. Phragmén and Lindelöf studied the properties of this
function and observed its convexity properties. A more detailed study was made in
1929 by Pólya, who was able to utilize the advances made in the theory of convex
domains and convex functions in the meantime. Pólya based his study on a three-
membered functional inequality satisfied by g, our formula (13.2.23) below.

So much for the early history. The indicator G has a number of interesting 
properties. We expect it to have connections with convexity and our expectations 
will be amply fulfilled. The basic properties follow more or less directly from the 
definition. 




Theorem 13.2.3. G is positive-homogeneous, i.e.

    G(av) = aG(v),    v ∈ S, 0 < a.    (13.2.12)

Proof by inspection. ∎

Actually we see that the domain of definition of G is not S but C[S], the least
cone with vertex at the origin which contains S, in other words, the convex hull of S.
Thus, while S is not necessarily a convex domain in R^m, C[S] has this property.
The subadditivity of F in S implies subadditivity of G in C[S].


Theorem 13.2.4. If v_1, v_2 ∈ C[S], then

    G(v_1 + v_2) ≤ G(v_1) + G(v_2).    (13.2.13)

Proof. For large values of r the vectors rv_1 and rv_2 ∈ S if v_1, v_2 ∈ C[S]. Hence

    F[r(v_1 + v_2)] ≤ F(rv_1) + F(rv_2).

Divide by r and pass to the limit to obtain (13.2.13). ∎

Combination of (13.2.12) and (13.2.13) yields convexity.

Theorem 13.2.5. If x_1, x_2 ∈ C[S], so is ½(x_1 + x_2) and

    G[½(x_1 + x_2)] ≤ ½[G(x_1) + G(x_2)].    (13.2.14)

Proof. The first assertion follows from the convexity of C[S], the second from

    G[½(x_1 + x_2)] ≤ G(½x_1) + G(½x_2) = ½G(x_1) + ½G(x_2). ∎


For the properties of convex functions on R^m to be used in the following, see
the end of the next section. It is shown there that (13.2.14) implies

    G[a_1 x_1 + a_2 x_2 + ··· + a_n x_n] ≤ a_1 G(x_1) + a_2 G(x_2) + ··· + a_n G(x_n)    (13.2.15)

for any choice of points x_1, x_2, ..., x_n in the domain of definition of G, C[S] in our
case, and for any choice of non-negative weights a_j of sum unity. Actually in
Section 13.3 the argument is given only for weights which are rational numbers.
In general, irrational weights require continuity of G, a property which we want to
prove rather than assume. But in the present case (13.2.15) holds for arbitrary real
non-negative weights. We need merely to use subadditivity and imitate the
derivation of (13.2.14) from (13.2.13). Thus we have

Corollary of Theorem 13.2.5. Formula (13.2.15) holds for any choice of points
in C[S] and any non-negative weights of sum unity.


We have seen that an indicator G can take on the value −∞. Formula
(13.2.15) throws further light on this situation. Take n = m, the dimension of the
space and of S and C[S], and choose m linearly independent vectors {x_j} in C[S].
If G(x_j) = −∞ for some j, then the inequality shows that G(x) = −∞ for all
vectors x which are linear combinations of the basis vectors with positive weights
adding up to unity. This is an (m − 1)-dimensional variety, B_0 say. But the
conclusion also holds for any vector y = ax such that x ∈ B_0, 0 < a, and this is a
cone V, a conical subset of C[S]. By varying the basis vectors we extend the
conclusion to all of C[S]. Hence we have

Theorem 13.2.6. G takes on the value −∞ either everywhere in C[S] or
nowhere.


Ultimately this result is based on Theorem 13.1.3, which states that for a
measurable finite-valued function, subadditive on R^+, the ratio f(r)/r tends to a
limit when r becomes infinite and this limit may be −∞. This possibility is
excluded if f is subadditive on R. Consequently, if F is defined, finite-valued,
measurable (including radially measurable) and subadditive on all of R^m, then for x
in a suitable half-space

    −∞ < G(−x) ≤ G(x) < ∞.    (13.2.16)


G is bounded above in C[S] in the following sense:

Theorem 13.2.7. If G is finite-valued in C[S], then for each proper sub-cone V
there is a number M[V] such that

    G(x) ≤ M[V] ||x||,    ∀x ∈ V.    (13.2.17)

Proof. As above, choose a basis {u_j} of vectors belonging to C[S]. This time each
u_j shall be a unit vector. The (m − 1)-dimensional polytope B_0 now consists of all
vectors of the form

    x = Σ_{j=1}^m a_j u_j,    a_j ≥ 0,    Σ_{j=1}^m a_j = 1,    (13.2.18)

and

    G(x) ≤ Σ_{j=1}^m a_j G(u_j) ≤ max_j G(u_j).    (13.2.19)

Here the distance of B_0 from the origin is a positive number d which depends upon
the geometrical configuration and is, in any case, ≤ 1. If V is the cone swept out
by B_0 we have, for any x ∈ V,

    G(x) ≤ d^{-1} |max_j G(u_j)| ||x||.    (13.2.20)

Thus we can take M[V] = d^{-1} |max_j G(u_j)|. ∎


An adaptation of the technique used in proving Theorem 13.2.2 could be used
to prove that G is also bounded below on V, but convexity gives a much stronger
result.

Theorem 13.2.8. A finite-valued indicator G defined on the cone C[S] in R^m
is continuous on any closed proper sub-cone V and uniformly continuous on
compact subsets of V.

Proof. We appeal to Theorem 13.3.7 of the next section, where it is proved that a
function which is convex, measurable and finite-valued in a convex domain D is
continuous in D. These properties hold for a finite-valued indicator. ∎

Suppose again that {u_j} is a set of m linearly independent vectors, all in C[S].
Let B_0 and V be defined as above. Then the vectors in V are linear combinations of
the basis vectors with non-negative constant multipliers. By Cramer's rule we can
write such a combination as follows:

    Δ x = Σ_{j=1}^m Δ_j(x) u_j.    (13.2.21)

Here the coefficients are certain determinants. If

    u_j = (u_{j1}, u_{j2}, ..., u_{jm}),    x = (x_1, x_2, ..., x_m),

then Δ = det (u_{jk}) and Δ_j(x) is the determinant obtained by replacing the jth row
of Δ by x_1, x_2, ..., x_m. Here Δ ≠ 0 and we can choose the numbering of the basis
so that Δ > 0. For x ∈ V the Δ_j(x) are non-negative. The basic inequality now
becomes

    Δ G(x) ≤ Σ_{j=1}^m Δ_j(x) G(u_j).    (13.2.22)


The special case m = 2 is in the literature. We represent R^2 by the complex
plane and suppose that C[S] is the sector α < arg z < β. We set G(e^{iθ}) = g(θ).
Now choose three angles θ, θ_1, θ_2 such that

    α < θ_1 < θ < θ_2 < β.

It is assumed that S is not the whole plane, so that β − α ≤ π. Then

    u_1 = (cos θ_1, sin θ_1),    u_2 = (cos θ_2, sin θ_2),

while x is taken to be (cos θ, sin θ). This gives

    Δ = sin (θ_2 − θ_1),    Δ_1 = sin (θ_2 − θ),    Δ_2 = sin (θ − θ_1),

and (13.2.22) becomes

    sin (θ_2 − θ_1) g(θ) ≤ sin (θ_2 − θ) g(θ_1) + sin (θ − θ_1) g(θ_2).    (13.2.23)
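The three-membered inequality lends itself to a direct numerical test. In the sketch below, offered as an illustration rather than a proof, polya_gap is a hypothetical helper returning the right member minus the left member of (13.2.23); it is evaluated for the indicator g(θ) = sin^2 θ sec θ of Example 3 and for a function a cos θ + b sin θ with arbitrarily chosen coefficients, for which the gap should vanish up to rounding.

```python
import math


def polya_gap(g, t1, t, t2):
    """Right member minus left member of (13.2.23); non-negative when the inequality holds."""
    return (math.sin(t2 - t) * g(t1) + math.sin(t - t1) * g(t2)
            - math.sin(t2 - t1) * g(t))


def g_example3(t):
    """Indicator of Example 3, g(theta) = sin^2(theta) sec(theta), |theta| < pi/2."""
    return math.sin(t) ** 2 / math.cos(t)


def sinusoid(t, a=1.3, b=-0.4):
    """a cos(theta) + b sin(theta); the coefficients are arbitrary illustrative values."""
    return a * math.cos(t) + b * math.sin(t)


if __name__ == "__main__":
    t1, t, t2 = -1.0, 0.2, 1.2
    print(polya_gap(g_example3, t1, t, t2))   # expected to be positive
    print(polya_gap(sinusoid, t1, t, t2))     # expected to be zero up to rounding
```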


This is the functional inequality studied by Pólya in 1929. It governs the radial
growth indicator of log |f(re^{iθ})| if f is holomorphic and of exponential growth in a
sector. We have now seen that it holds for the radial indicator of a function sub-
additive in an angular semi-module. Further, and this is the oldest result, it is
satisfied by the function of support of a convex domain in the plane. This was first
pointed out by the German mathematician Wilhelm Blaschke (1885-1962) in 1914.
Blaschke was an eminent differential geometer who also did important work on
convexity and analytic function theory.

The case m = 3 is also of some interest. Here the coefficients in (13.2.22) are
known in classical vector analysis as box products and the formula would be
written

    [u_1 u_2 u_3] G(u) ≤ [u u_2 u_3] G(u_1) + [u_1 u u_3] G(u_2) + [u_1 u_2 u] G(u_3).    (13.2.24)

The box product is obviously a scalar. Expressed in terms of the traditional cross
and dot products, we have [uvw] = (u × v) · w.
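For m = 3 the inequality (13.2.24) can be tried out with a concrete convex, positive-homogeneous function. In the sketch below the Euclidean norm plays the part of G and a linear functional the part of the equality case; both choices, like the helper names box and gap_13_2_24, are assumptions made for the illustration and do not come from the text.

```python
import math


def box(u, v, w):
    """Box product [uvw] = (u x v) . w."""
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return cross[0] * w[0] + cross[1] * w[1] + cross[2] * w[2]


def norm(x):
    """Euclidean norm, used here as a sample convex positive-homogeneous G."""
    return math.sqrt(sum(c * c for c in x))


def linear(x, b=(0.5, -1.0, 2.0)):
    """A linear functional with arbitrary illustrative coefficients b."""
    return sum(bj * xj for bj, xj in zip(b, x))


def gap_13_2_24(G, u1, u2, u3, u):
    """Right member minus left member of (13.2.24)."""
    return (box(u, u2, u3) * G(u1) + box(u1, u, u3) * G(u2)
            + box(u1, u2, u) * G(u3) - box(u1, u2, u3) * G(u))


if __name__ == "__main__":
    e1, e2, e3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
    u = (1.0, 2.0, 3.0)          # lies in the cone spanned by e1, e2, e3
    print(gap_13_2_24(norm, e1, e2, e3, u))
    print(gap_13_2_24(linear, e1, e2, e3, u))
```

With the standard basis the box products reduce to the coordinates of u, so the first gap is 1 + 2 + 3 − ||u|| and the second is zero.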
The functional equation

    H(x) = Σ_{j=1}^m a_j H(u_j),    x = Σ_{j=1}^m a_j u_j    (13.2.25)

is satisfied by any linear functional on R^m, i.e. we can take

    H(x) = Σ_{j=1}^m b_j x_j,    if x = (x_1, x_2, ..., x_m)    (13.2.26)

where the b's are arbitrary real numbers. Cf. Problem 7, Exercise 1.2, where it is
stated that any linear functional of C^m is an inner product. The same obviously
holds in R^m.

This fact has a bearing on the family of solutions of the functional inequality
(13.2.25). It shows that if G is a member of the family, so is G + H for any H of the
form (13.2.26). This fact may be used to improve the local properties of an indicator
function. For (13.2.23) the corresponding "sinusoid function"

    h(θ) = a_1 cos θ + b_1 sin θ    (13.2.27)

has been used for such purposes.

The time has come to elucidate the connection between indicators G(x) on the
one hand and convex solids on the other. As a matter of fact, the connection is
twofold. Given a convex solid K in R^m and a pole P, P ∈ K, we can construct two
distinct solutions of the inequality (13.2.15) each of which describes K uniquely but
in different language. One of these solutions is the gauge function of K in the sense
of Minkowski, the other its function of support in the sense of Blaschke-Minkowski.
The first of these gives the equation of ∂K, the boundary of K, in terms of point
coordinates with P as the origin. Here

    p(x) = 1    (13.2.28)

is the equation of ∂K where p(x) is the gauge function of K which satisfies (13.2.15).
The other describes K in terms of "tangential coordinates". If u is a unit vector
from P and h(u) is the (oriented) length of the foot point perpendicular from the
pole to the tangent plane of K with normal u, then K comes out as enveloped by the
plane bundle

    (x, u) = h(u).    (13.2.29)

Here h(u), the function of support of K, also satisfies (13.2.15).

Conversely, given a finite-valued measurable solution G of (13.2.15), we can
construct a convex solid K having G as its gauge function and another convex
solid K̂ having G as its function of support, in both cases with respect to a pole
which is taken as the origin of coordinates. These two solids are said to be polar
to each other. The language used here (poles, polar, point coordinates, tangential
coordinates) goes back to the German mathematician and experimental physicist
Julius Plücker (1801-68), creator in both the fields of mathematics and of physics
(algebraic geometry, line geometry, and spectral analysis).

Gauge functions occurred already in Section 1.2 and played a basic role in the
discussion of linear functionals in Section 10.3. We recall the definition and basic
properties of such functions, here specialized to Euclidean space R^m.


Definition 13.2.4. Let K be a convex solid in R^m containing the origin as an
interior point. The gauge function p(x) of K is defined as

    p(x) = inf {a; a > 0, x ∈ aK},    x ∈ R^m.    (13.2.30)

Theorem 13.2.9. The gauge function has the following properties: (1) p(0) = 0,
(2) p(ax) = ap(x), 0 < a, ∀x, (3) p(x + y) ≤ p(x) + p(y), (4)

    {x; p(x) < 1} ⊂ K ⊂ {x; p(x) ≤ 1}.    (13.2.31)

The proof is left to the reader. Cf. Theorem 10.3.1.

For an ellipsoid with center at the origin and semi-axes a, b, c, the gauge func-
tion is simply the square root of the left member of the point equation of the
ellipsoid, i.e.

    p(x) = (x_1^2/a^2 + x_2^2/b^2 + x_3^2/c^2)^{1/2}.


Definition 13.2.5. Let K be a convex solid in R^m containing the origin as an
interior point. Consider the family of parallel hyperplanes

    H(u, b): (x, u) = b,    u fixed unit vector, 0 < b.    (13.2.32)

Here x ∈ R^m and (x, u) is the usual inner product. Set

    h(u) = sup {b; K ∩ H(u, b) ≠ ∅}.    (13.2.33)

Then h is known as the function of support of K, with value h(u) in the direction u,
while

    (x, u) = h(u)    (13.2.34)

is the corresponding plane of support.

Here the basic results are given by

Theorem 13.2.10. If K is a bounded closed convex solid containing the origin in
its interior, then h satisfies

    h(a_1 u_1 + a_2 u_2 + ··· + a_m u_m) ≤ a_1 h(u_1) + a_2 h(u_2) + ··· + a_m h(u_m).    (13.2.35)

Here {u_j} is any basis of unit vectors for R^m confined to an arbitrary half-space and
the a_j are any non-negative numbers. Further, h is bounded and continuous.
If K is unbounded but confined to a half-space, say x_1 ≤ a, let C_0 be the cone
formed by the rays for which h(u) < +∞. Let the basis {u_j} be confined to C_0.
Then (13.2.35) holds for any non-negative weights and h is bounded and continuous on
any compact subset of C_0.


Proof. From (13.2.34) it follows that we can extend the definition of h from unit
vectors to arbitrary vectors by setting v = au and h(v) = ah(u), 0 < a. This ensures
positive homogeneity. We know from the discussion of indicators G that in order
to prove (13.2.35) it is enough to show that h(v) is convex. This follows from the
convexity of K. For we have by (13.2.33) and the extension of the definition of h,

    h(v) = sup (x, v),    x ∈ K,    (13.2.36)

so that

    (x, v) ≤ h(v),    ∀x ∈ K.    (13.2.37)

If, now, v_1, v_2 are two vectors for which h(v_1) and h(v_2) are finite, then for all x ∈ K

    (x, v_1) ≤ h(v_1),    (x, v_2) ≤ h(v_2),

and for 0 < α < 1

    α(x, v_1) + (1 − α)(x, v_2) ≤ αh(v_1) + (1 − α) h(v_2).

By the additivity of the inner product the left member equals

    (x, αv_1 + (1 − α) v_2),

and if the inequality

    (x, αv_1 + (1 − α) v_2) ≤ αh(v_1) + (1 − α) h(v_2)

holds for all x ∈ K, it also holds for the supremum, so that

    h[αv_1 + (1 − α) v_2] ≤ αh(v_1) + (1 − α) h(v_2).    (13.2.38)

This is the basis for (13.2.35).

The argument also shows that the vectors v for which h(v) is finite form a convex
set. This may be the whole space as in the case of a bounded convex set K. If this
is not the case, then the vectors for which h(v) is finite-valued form a convex cone
C_0 confined to a half-space. This is the case when K is unbounded.

In the first case h is bounded and hence continuous by Theorem 13.3.7 below.
In the second case the same conclusion holds on any compact subset of C_0. ∎


Before taking up the converse problem (given an indicator G, find K and K̂)
we shall illustrate by a concrete problem.

Example 5. We take m = 2, R^2 = C and the function

    F(z) = F(x + iy) = ax + ¼ y^2 x^{-1},    x > 0.    (13.2.39)

Here a is a fixed positive number. The function F is closely related to that of
formula (13.2.8). Thus the reader will know that F is subadditive in the right half-
plane. Moreover, it is its own indicator, so that G(z) = F(z).

The first problem is to interpret G as a gauge function. We have clearly

    p(z) = G(z) = ax + ¼ y^2 x^{-1}    (13.2.40)

and the closed convex region K with this function as its gauge measure is the
interior and boundary of the ellipse

    4ax^2 − 4x + y^2 = 0    (13.2.41)

which passes through the origin.

To get the polar region, we set z = e^{iθ} and get

    h(e^{iθ}) = g(θ) = a cos θ + ¼ sin^2 θ sec θ.    (13.2.42)

It is a simple matter to show that the straight lines

    x cos θ + y sin θ = a cos θ + ¼ sin^2 θ sec θ    (13.2.43)

for |θ| < ½π have the parabola

    y^2 = a − x    (13.2.44)

as their envelope. The interior and boundary of the parabola is the unbounded
convex polar region K̂.
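Both halves of Example 5 can be checked numerically. The sketch below is an illustration only: the value a = 0.75, the grid parameters and all function names are assumptions made here. It verifies that points of the ellipse (13.2.41) satisfy G = 1, and that the supremum of x cos θ + y sin θ over the parabola y^2 = a − x, found by brute force on a grid, agrees with g(θ) of (13.2.42).

```python
import math

A = 0.75  # hypothetical value of the fixed positive number a


def G(x, y, a=A):
    """G(z) = a x + y^2 / (4 x) for x > 0, as in (13.2.40)."""
    return a * x + y * y / (4.0 * x)


def ellipse_point(x, a=A):
    """Upper point of the ellipse 4 a x^2 - 4 x + y^2 = 0 with abscissa 0 < x <= 1/a."""
    return (x, math.sqrt(4.0 * x - 4.0 * a * x * x))


def g(theta, a=A):
    """Support function (13.2.42)."""
    return a * math.cos(theta) + 0.25 * math.sin(theta) ** 2 / math.cos(theta)


def support_of_parabola(theta, a=A, ymax=5.0, steps=20001):
    """Grid approximation of sup (x cos(theta) + y sin(theta)) over the curve x = a - y^2."""
    best = -math.inf
    for i in range(steps):
        y = -ymax + 2.0 * ymax * i / (steps - 1)
        best = max(best, (a - y * y) * math.cos(theta) + y * math.sin(theta))
    return best


if __name__ == "__main__":
    for theta in (-1.0, 0.0, 0.6, 1.2):
        print(theta, support_of_parabola(theta), g(theta))
```

For |θ| < ½π the supremum over the region x ≤ a − y^2 is attained on its boundary, which is why only the curve is scanned.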
We have now

Theorem 13.2.11. Let G be a finite-valued continuous function, defined, convex
and positive-homogeneous in some open cone C_0 of R^m. Here C_0 is either all of
R^m or confined to a half-space. Then there exist two convex solids, K and K̂,
such that G is gauge function of K and support function of K̂.

Proof. I. We define K by

    K: {x; x ∈ C_0, G(x) ≤ 1}.    (13.2.45)

(1) The set K is convex. If x_1, x_2 ∈ K and 0 < α < 1, then

    G[αx_1 + (1 − α)x_2] ≤ αG(x_1) + (1 − α) G(x_2) ≤ α·1 + (1 − α)·1 = 1,

so αx_1 + (1 − α)x_2 ∈ K.

(2) K is closed relative to C_0. For if {x_n} ⊂ C_0 is a Cauchy sequence with a
limit point x_0 in C_0, then G(x_n) ≤ 1 together with the continuity of G in C_0 implies
G(x_0) ≤ 1.

Thus K is convex, closed relative to C_0 and has G as its gauge function by
virtue of (13.2.45) and the properties of G.

II. K̂ is defined by

    K̂: {x; x ∈ R^m, (x, u) ≤ G(u), u ∈ C_0}.    (13.2.46)

(i) K̂ is convex. For if x_1, x_2 ∈ K̂ and 0 < α < 1, then

    α(x_1, u) + (1 − α)(x_2, u) ≤ αG(u) + (1 − α) G(u) = G(u)

for all u ∈ C_0, i.e.

    (αx_1 + (1 − α)x_2, u) ≤ G(u),    ∀u ∈ C_0,

and αx_1 + (1 − α)x_2 ∈ K̂.

(ii) K̂ is closed. If {x_n} ⊂ K̂ and is a Cauchy sequence in R^m with limit x_0,
then for each n and all u ∈ C_0

    (x_n, u) ≤ G(u)

and by the continuity of the inner product

    lim_{n→∞} (x_n, u) = (x_0, u) ≤ G(u).

(iii) K̂ is unbounded, if C_0 is confined to a half-space. To fix the ideas, suppose
that if u = (u_1, u_2, ..., u_m) ∈ C_0 (Cartesian coordinates!), then u_1 > 0. Consider the
vector x = (−a, 0, 0, ..., 0), a > 0. Then

    (x, u) = −au_1 < 0,    u ∈ C_0.

It follows that K̂ contains a ray, the negative x_1-axis, and is consequently
unbounded.

If C_0 is all of R^m, then |G(u)| ≤ M, some finite number, for all unit vectors u.
This gives |(x, u)| ≤ M for x ∈ K̂. For each x_0 on ∂K̂, there is a finite positive a_0 and
a unit vector u_0 such that x_0 = a_0 u_0 and (x_0, u_0) ≤ M gives a_0 ≤ M, so that all of K̂
is confined to the sphere ||x|| ≤ M.

(iv) G is support function of K̂. This follows from (13.2.46). ∎

It may be shown that through every point on ∂K̂, the boundary of K̂, passes at
least one support hyperplane, that is, a plane of the form

    (x, u) = G(u).    (13.2.47)


EXERCISE 13.2 


1. Prove Lemma 13.2.1.

2. Verify Lemma 13.2.5 and Theorem 13.2.1 when S is in R^3.

3. Verify Lemma 13.2.6 in R^3.

4. Verify (13.2.9).

5. Verify the assertions made in Example 2.

6. What is the envelope of the family of straight lines
x cos θ + y sin θ = sin^2 θ sec θ, |θ| < ½π?

7. Let k, s, t be positive numbers, k fixed. Show that for all s, t
( Ὁ ἘΠῚ cat oka + “ἘΠ τ. 

[Hint: What is the maximum of the right member for fixed s?]

8. Prove that y^{k+1} x^{-k} is subadditive in the right half-plane and determine the
indicator G.

9. Find the convex domain K for which h(θ) = sec θ, |θ| < ½π, is the support function.

10. Rotate the set {(x, y); 0 < y < x^2, 0 < x} about the x-axis. Show that the open
solid of revolution obtained in this manner is an angular semi-module S and find
C[S], its convex hull.


11. Let x = (x_1, x_2, x_3) be a vector in R^3 and show that the mapping x → F(x) =
(x_1^2 + x_2^2 + x_3^2) x_3^{-1} is subadditive in the half-space x_3 > 0.


12. If u = (cos α, cos β, cos γ) is a unit vector in R^3, find the indicator G of F with F as
in Problem 11. This is a function of γ alone, say h(γ).

13. With h(γ) as defined, show that the family of planes

    x_1 cos α + x_2 cos β + x_3 cos γ = h(γ),    |γ| < ½π,

envelops a paraboloid of revolution. This is the surface of the convex solid K̂ with G
as support function.

14. Determine the convex solid K having the same G as gauge function. Show that it is
a sphere.

15. Verify (13.2.21).

16. Verify (13.2.23).

17. Prove Theorem 13.2.9.

18. Verify [uvw] = (u × v) · w.

19. Show that (13.2.27) satisfies (13.2.23) with equality.

20. Verify that (13.2.26) is a solution of (13.2.25). Determine a subadditive function
F(x) which has H(x) as radial growth indicator. Is F unique? Determine the
corresponding convex solids, K and K̂.

21. Verify the assertions made in Example 5.

22. How would you prove part II (iv) of Theorem 13.2.11?

23. Try to prove that there is a support plane passing through every point of the boundary
∂K of a convex solid K.


13.3 CONVEX FUNCTIONS 


In formula (13.1.1) we take

    x ∘ y = ½(x + y),    G(u, v) = ½(u + v),    (13.3.1)

so the inequality reads

    f[½(x + y)] ≤ ½[f(x) + f(y)].    (13.3.2)

This inequality, first considered by Otto Hölder in 1889 and in more detail by the
Danish mathematician and telephone engineer J. L. W. V. Jensen (1859-1925) in
1906, characterizes convex functions. If −f is convex, then f is said to be concave.
The geometric meaning of (13.3.2) is that the midpoint of any chord of the curve
t = f(s) lies above or on the curve. Here "curve" means any, not necessarily
rectifiable or continuous, graph.

Actually a convex function is either continuous or very irregular, unbounded
in every interval. Thus a measurable convex function is actually continuous. We
shall ultimately prove this under the additional assumption that f is finite-valued.
This will be postponed until we have proved some implications of (13.3.2) with or
without assuming continuity.


Theorem 13.3.1. Let f be defined in (a, b) and satisfy (13.3.2) whenever x and y
belong to (a, b). Let x_1, x_2, ..., x_n be n points in (a, b), not necessarily distinct.
Let r_1, r_2, ..., r_n be n non-negative rational numbers of sum unity. Then

    f(Σ_{j=1}^n r_j x_j) ≤ Σ_{j=1}^n r_j f(x_j).    (13.3.3)

The inequality remains valid if some or all of the weights r_j are irrational,
provided f is continuous.


Proof. The first step is to verify (13.3.3) if n = 2^m and all weights are equal,
r_j = 2^{-m}. From (13.3.2) we get for m = 2

    4 f[¼(x_1 + x_2 + x_3 + x_4)] ≤ 2 f[½(x_1 + x_2)] + 2 f[½(x_3 + x_4)]
                                  ≤ f(x_1) + f(x_2) + f(x_3) + f(x_4).

Complete induction shows then that

    f((1/n) Σ_{j=1}^n x_j) ≤ (1/n) Σ_{j=1}^n f(x_j)    (13.3.4)

holds for all n of the form n = 2^m. To extend the formula to arbitrary positive
integers we adapt an argument due to Cauchy (1821) as follows.

Suppose that (13.3.4) holds for some value of n and for all choices of n points
in (a, b). It will be shown that the formula holds also for the preceding integer
n − 1 (retrogressive induction). Let x_1, x_2, ..., x_{n−1} be given and set

    x̄ = (1/(n − 1)) (x_1 + x_2 + ··· + x_{n−1}).

We have then

    f(x̄) = f[(1/n)(x_1 + x_2 + ··· + x_{n−1} + x̄)]
         ≤ (1/n)[f(x_1) + f(x_2) + ··· + f(x_{n−1}) + f(x̄)],

which, after the term (1/n) f(x̄) is moved to the left and the result is multiplied by
n/(n − 1), is (13.3.4) with n replaced by n − 1. Since (13.3.4) holds for all powers of 2,
this artifice enables us to extend the validity of the formula to all positive values of n.

If now r_1, r_2, ..., r_n are n given non-negative rational numbers of sum unity, we
can write

    r_1 = p_1/q,    r_2 = p_2/q,    ...,    r_n = p_n/q,

where q is the least common denominator of the given fractions and the p's are
positive integers ≤ q or 0. We then apply formula (13.3.4) where n = q and each x_j
is repeated p_j times. This gives (13.3.3).

If f is continuous, the assumption that the weights r_j are rational may be
dropped, and we can allow arbitrary non-negative real numbers. Such numbers
can be approximated arbitrarily closely by rationals for which (13.3.3) is known to
be valid. Passing to the limit with the r's on both sides of the inequality and using
continuity, we obtain the desired result. ∎
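The passage from equal weights to rational weights is purely combinatorial and can be mirrored in exact arithmetic. The sketch below is an illustration under stated assumptions: the function name jensen_sides, the sample points and the sample function f(x) = x^2 are choices made here, and math.lcm requires Python 3.9 or later. Each point x_j is repeated p_j times among q points, after which the equal-weight formula (13.3.4) is evaluated.

```python
from fractions import Fraction
from math import lcm


def jensen_sides(f, points, weights):
    """Return (f(sum r_j x_j), sum r_j f(x_j)) via the repetition device of the proof.

    weights are Fractions r_j = p_j / q of sum unity; x_j is repeated p_j times.
    """
    q = lcm(*(w.denominator for w in weights))
    repeated = []
    for x, w in zip(points, weights):
        repeated += [x] * int(w * q)          # p_j copies of x_j
    if len(repeated) != q:
        raise ValueError("weights must be non-negative and of sum unity")
    mean = sum(repeated) / q                  # equal-weight mean of q points
    return f(mean), sum(f(x) for x in repeated) / q


if __name__ == "__main__":
    pts = [Fraction(1), Fraction(4), Fraction(-2)]
    wts = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    left, right = jensen_sides(lambda x: x * x, pts, wts)
    print(left, right, left <= right)
```

Working with Fraction values keeps both members exact, so the comparison is not affected by rounding.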


In particular, we see that the continuity of f permits us to generalize (13.3.2) to

    f[αx + (1 − α)y] ≤ αf(x) + (1 − α) f(y)    (13.3.5)

valid for any α between 0 and 1.

We proceed to a further discussion of the geometric implications of (13.3.2)
and (13.3.5). Take four values of x

    a < x_1 < x_2 < x_3 < x_4 < b

and set P_j = [x_j, f(x_j)]. Form the polygon [P_1, P_2, P_3, P_4, P_1]. It is a convex
closed polygon, i.e. if two points P and Q lie inside or on the polygon, so does the
line segment joining P and Q. Since f is a convex function by assumption, any one
of the arcs of the curve t = f(s) bounded by two points P_j and P_k, 1 ≤ j < k ≤ 4,
lies below or on the corresponding chord P_j P_k. Here the reader is advised to draw
a figure. If m_{jk} denotes the slope of the straight line through P_j and P_k, there are a
number of inequalities between the slopes implied by the geometry of the situation.
The following is basic:

    m_{12} ≤ m_{23} ≤ m_{34},    (13.3.6)

but we have also

    m_{12} ≤ m_{13} ≤ m_{14},    m_{12} ≤ m_{23} ≤ m_{24},
                                                        (13.3.7)
    m_{14} ≤ m_{24} ≤ m_{34},    m_{13} ≤ m_{23} ≤ m_{34}.

If we now express the slopes in terms of the coordinates of the endpoints of the
chords, a number of inequalities result of which we shall list only those given by
(13.3.6), namely,

    [f(x_2) − f(x_1)]/(x_2 − x_1) ≤ [f(x_3) − f(x_2)]/(x_3 − x_2) ≤ [f(x_4) − f(x_3)]/(x_4 − x_3).    (13.3.8)


The inequalities have a number of implications. Set 


$$x_2 = x_1 + h, \quad x_4 = x_3 + h, \quad 0 < h,$$

to obtain

$$\frac{1}{h}[f(x_1 + h) - f(x_1)] \le \frac{1}{h}[f(x_3 + h) - f(x_3)]. \tag{13.3.9}$$


This states that the difference quotient 


$$\frac{1}{h}[f(x + h) - f(x)] \tag{13.3.10}$$


is, for fixed h > 0, a non-decreasing function of x. In the same manner one shows 
that, for fixed x, the difference quotient does not increase as h decreases toward 
zero. 


Similarly, setting 
$$x_1 = x - h_2, \quad x_2 = x - h_1, \quad x_3 = x, \quad 0 < h_1 < h_2,$$

we obtain an inequality which reduces to

$$\frac{1}{h_2}[f(x) - f(x - h_2)] \le \frac{1}{h_1}[f(x) - f(x - h_1)]. \tag{13.3.11}$$


This shows that the difference quotient 


$$\frac{1}{h}[f(x) - f(x - h)] \tag{13.3.12}$$

does not decrease as h decreases to zero, x being fixed. Similarly, we show that for
any choice of positive $h$ and $h_1$

$$\frac{1}{h_1}[f(x) - f(x - h_1)] \le \frac{1}{h}[f(x + h) - f(x)]. \tag{13.3.13}$$

In particular, for $h_1 = h$ this gives

$$f(x + h) - 2f(x) + f(x - h) \ge 0. \tag{13.3.14}$$


Thus the second difference of a convex function is non-negative while the first
difference of a continuous convex function is increasing as a function of x for fixed
h > 0.
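The slope inequality (13.3.6) and the second-difference inequality (13.3.14) are easy to test numerically. The following is only a sketch: the helper names `slope`, `check_three_slopes` and `second_difference` are invented for this illustration, and the exponential function is chosen merely as a convenient convex example.

```python
import math

def slope(f, x1, x2):
    """Slope of the chord through (x1, f(x1)) and (x2, f(x2))."""
    return (f(x2) - f(x1)) / (x2 - x1)

def check_three_slopes(f, xs, tol=1e-12):
    """Check m12 <= m23 <= m34 for four increasing abscissas, as in (13.3.6)."""
    x1, x2, x3, x4 = xs
    m12, m23, m34 = slope(f, x1, x2), slope(f, x2, x3), slope(f, x3, x4)
    return m12 <= m23 + tol and m23 <= m34 + tol

def second_difference(f, x, h):
    """The second difference f(x + h) - 2 f(x) + f(x - h) of (13.3.14)."""
    return f(x + h) - 2.0 * f(x) + f(x - h)

# exp is convex on the whole line, so both checks should succeed.
slopes_ok = check_three_slopes(math.exp, (-1.0, 0.0, 0.5, 2.0))
second_ok = all(second_difference(math.exp, x, h) >= 0.0
                for x in (-2.0, 0.0, 1.5) for h in (0.1, 0.5, 1.0))

# sin is concave on [0, pi], so the slope chain fails there.
sine_ok = check_three_slopes(math.sin, (0.0, 1.0, 2.0, 3.0))
```

A finite check of this kind cannot prove convexity, of course; it can only expose a violation, as it does for the sine on $[0, \pi]$.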


We can now prove 


Theorem 13.3.2. A function f which is continuous and convex in an interval
(a, b) has a left hand derivative $D^- f(x)$ as well as a right hand derivative $D^+ f(x)$
at each point x of (a, b) and

$$D^- f(x) \le D^+ f(x). \tag{13.3.15}$$

If f has a unique derivative in some subinterval (c, d), then that derivative Df(x)
is a non-decreasing function of x in (c, d).


Proof. This follows from the properties of the difference quotients (13.3.10) and
(13.3.12). Let us consider the latter. For fixed x, the quotient does not decrease as h
decreases to zero and by (13.3.13) it is bounded above. Hence it has a finite limit
$D^- f(x)$. Similarly, the quotient (13.3.10) is for fixed x a non-decreasing function of
h, so it does not increase as h decreases to zero and, again by (13.3.13), it is bounded
below. Hence $D^+ f(x)$ exists and (13.3.13) shows that the limit of the left hand member
cannot exceed the limit of the right, so that (13.3.15) must hold.

Suppose now that $D^- f(x) = D^+ f(x) = Df(x)$ for x in some interval (c, d).
If now $c < x_1 < x_3 < d$, we conclude from (13.3.9) that

$$Df(x_1) = \lim_{h \to 0} \frac{1}{h}[f(x_1 + h) - f(x_1)] \le \lim_{h \to 0} \frac{1}{h}[f(x_3 + h) - f(x_3)] = Df(x_3)$$

as asserted. ∎


We have

$$\lim_{h \to 0} h^{-2}[f(x + h) - 2f(x) + f(x - h)] = f''(x) \tag{13.3.16}$$

at all points where the second derivative exists.
These observations lead to 


Theorem 13.3.3. If f is continuous and convex in (a, b) and if f is twice 
differentiable, then 


$$f''(x) \ge 0, \quad \forall x. \tag{13.3.17}$$


Conversely, if f has a non-negative second derivative in (a, b), then f is convex 
there. 


Proof. The first assertion follows from (13.3.14) and (13.3.16). For the second
assertion the elementary calculus argument goes as follows. If $f''$ exists and
satisfies (13.3.17), then $f'(x)$ exists, is continuous and is non-decreasing. There is a
unique curve tangent, the curve lies above or on its local tangent and the graph is
concave upwards; i.e. f is convex. For an alternate proof, see Problem 3 of
Exercise 13.3. ∎


For the following theorem cf. page 401. 


Theorem 13.3.4. Let f be convex in the open interval (a, b) and bounded above
in some closed subinterval [c, d]; then f is continuous in (a, b).


This is a special case of corresponding results for the m-dimensional case 
which is next on the agenda. 

Let D be a convex domain in $R^m$, i.e. an open connected set such that
$x_1, x_2 \in D$ and $0 < \alpha < 1$ implies that $\alpha x_1 + (1 - \alpha) x_2 \in D$. Let f be a mapping
from $R^m$ to $R^1$ defined on D. Then f is said to be convex in D if

$$f[\tfrac{1}{2}(x_1 + x_2)] \le \tfrac{1}{2}[f(x_1) + f(x_2)] \tag{13.3.18}$$

for all $x_1, x_2$ in D.
Theorem 13.3.5. If f is defined in D and convex in the sense of (13.3.18), then
for any choice of n vectors $x_j$ in D and any non-negative rational numbers $r_j$ of
sum unity we have

$$f\Big(\sum_{j=1}^{n} r_j x_j\Big) \le \sum_{j=1}^{n} r_j f(x_j). \tag{13.3.19}$$

If f is continuous, irrational weights may be allowed.


Proof. The proof given for (13.3.3) shows that formula (13.3.19) holds for any n
collinear points in D. To extend to higher dimensions we use induction on the
dimensionality of the closed convex hull of the points $x_j$. Suppose that the inequality
is known to be valid provided the points $x_1, x_2, \ldots, x_{n-1}$ belong to a convex subset
of D of dimension k. Then for any choice of $x_n \in D$ and any choice of non-negative
weights of sum unity

$$f(r_1 x_1 + r_2 x_2 + \cdots + r_{n-1} x_{n-1} + r_n x_n)$$
$$= f\Big[(1 - r_n)\,\frac{r_1 x_1 + r_2 x_2 + \cdots + r_{n-1} x_{n-1}}{1 - r_n} + r_n x_n\Big]$$
$$\le (1 - r_n)\, f\Big[\frac{r_1 x_1 + r_2 x_2 + \cdots + r_{n-1} x_{n-1}}{1 - r_n}\Big] + r_n f(x_n)$$
$$\le (1 - r_n) \sum_{j=1}^{n-1} \frac{r_j}{1 - r_n} f(x_j) + r_n f(x_n).$$

This is (13.3.19) for an arbitrary number of points spanned by a convex hull of
dimension k + 1. In deriving this inequality we have used (13.3.18) and the
induction hypothesis. Thus (13.3.19) is valid for all n and all points $x_j$ in D. ∎


There are various analogues of Theorems 13.3.2 and 13.3.3 for the 
m-dimensional case. They are left to the reader. On the other hand, we have to 
examine questions of boundedness and continuity. 


Theorem 13.3.6. Let x → f(x) be defined in a convex domain D of $R^m$ where it is
finite-valued, measurable, and convex. Then f is bounded on compact subsets
of D.


Proof. Let $D_0$ be a compact convex subset of D. Suppose that f is not bounded
above in $D_0$. Then there exists a sequence $\{x_n\}$ such that (i) $x_n \in D_0$, $\forall n$,
(ii) $\lim_{n \to \infty} x_n = x_0$ exists and $x_0 \in D_0$, (iii) $f(x_n) > n$. Let d be the distance of $x_0$ from
$\partial D$, the boundary of D, and let t be any vector such that $\|t\| < \tfrac{1}{2}d$. For such values
of t and for large numbers n it is seen that $x_n + t$ and $x_n - t$ belong to D. Hence

$$n < f(x_n) = f[\tfrac{1}{2}(x_n + t) + \tfrac{1}{2}(x_n - t)] \le \tfrac{1}{2}[f(x_n + t) + f(x_n - t)]$$

and either $f(x_n + t)$ or $f(x_n - t)$ must exceed n. Set

$$S_n = [x;\ \|x - x_n\| < \tfrac{1}{2}d,\ f(x) > n].$$

It is a measurable set and its measure is at least

$$\tfrac{1}{2}(\tfrac{1}{2}d)^m V_m, \tag{13.3.20}$$

where $V_m$ is the measure of the unit sphere in $R^m$. Now set

$$T_n = [x;\ \|x - x_0\| < \tfrac{1}{2}d,\ f(x) > n].$$

This is also a measurable set and for large values of n its measure differs arbitrarily
little from that of $S_n$ since the measure of the symmetric difference of the two
spheres

$$[x;\ \|x - x_n\| < \tfrac{1}{2}d] \quad \text{and} \quad [x;\ \|x - x_0\| < \tfrac{1}{2}d]$$

goes to zero as n → ∞. We now note that

$$T_n \supset T_{n+1} \supset \cdots, \qquad \bigcap_n T_n = T,$$

so that T is a measurable subset of D of measure at least equal to (13.3.20). But in
T all the inequalities f(x) > n are valid so that f(x) = +∞ for all $x \in T$. This
contradicts the assumption that f is finite-valued in D. It follows that there exists an
$M = M(D_0)$ such that

$$f(x) \le M(D_0), \quad \forall x \in D_0. \tag{13.3.21}$$


Next, suppose that f is not bounded below in $D_0$. Then there would exist a
sequence $\{s_n\}$ such that (i) $s_n \in D_0$ for all n, (ii) $\lim s_n = s_0$ exists and is in $D_0$,
(iii) $f(s_n) < -2n$. Then for $t \in D_0$

$$f[\tfrac{1}{2}(s_n + t)] \le \tfrac{1}{2} f(s_n) + \tfrac{1}{2} f(t) < -n + M(D_0).$$

Set

$$U_n = [x;\ x = \tfrac{1}{2}(s_n + t),\ t \in D_0],$$
$$U_0 = [x;\ x = \tfrac{1}{2}(s_0 + t),\ t \in D_0].$$

Let $V_n = \bigcup_{k \ge n} U_k$. Then $V_n \supset V_{n+1} \supset \cdots$, $\bigcap_n V_n = V$. Now V coincides with the
set $U_0$ which is of positive measure. Further, in V all inequalities f(x) < −n are
satisfied. This implies that f(x) = −∞ for all x in V and again this is a contradiction,
since f can nowhere take on the value −∞. Hence f is bounded below as
well as above in $D_0$. ∎


Theorem 13.3.7. A measurable finite-valued function convex in the convex 
domain D is continuous on compact subsets of D. 


Proof. Let $D_0$ be a convex compact subset of D. By the preceding theorem there
is a finite $B = B(D_0)$ such that $|f(x)| \le B$ for all $x \in D_0$. Set for $x \in D_0$, $x + h \in D_0$

$$\delta^+(x) = \limsup_{\|h\| \to 0}\, [f(x + h) - f(x)],$$
$$\delta^-(x) = \liminf_{\|h\| \to 0}\, [f(x + h) - f(x)].$$

These quantities are uniformly bounded in $D_0$. We have then

$$f(x + \tfrac{1}{2}h) = f[\tfrac{1}{2}(x + h) + \tfrac{1}{2}x] \le \tfrac{1}{2} f(x + h) + \tfrac{1}{2} f(x),$$

so that

$$f(x + \tfrac{1}{2}h) - f(x) \le \tfrac{1}{2}[f(x + h) - f(x)].$$

Hence

$$\delta^+(x) \le \tfrac{1}{2}\delta^+(x) \quad \text{or} \quad \delta^+(x) \le 0.$$

We have also $\delta^-(x) \le \delta^+(x) \le 0$. Since f is convex

$$f(x + h) - f(x) \ge -[f(x - h) - f(x)],$$

so that

$$\liminf_{\|h\| \to 0}\, [f(x + h) - f(x)] \ge \liminf_{\|h\| \to 0}\, \{-[f(x - h) - f(x)]\} = -\limsup_{\|h\| \to 0}\, [f(x - h) - f(x)]$$

or

$$\delta^-(x) \ge -\delta^+(x) \ge 0.$$

Hence $\delta^-(x) = 0$ and the same holds for $\delta^+(x)$. It follows that

$$\lim_{\|h\| \to 0} |f(x + h) - f(x)| = 0$$

for all x in $D_0$. Thus local boundedness implies local continuity. ∎


Corollary 1. Theorem 13.2.4. 


Corollary 2. Theorem 13.2.8. 


EXERCISE 13.3 
1. Let f and F be continuous, real-valued, twice differentiable functions, and let F be
defined on the range of f. If f is convex, find sufficient conditions for F[f(x)] to be
convex.

2. If f is positive, find sufficient conditions for log f to be convex.

3. Prove the second part of Theorem 13.3.3 by showing that such a function satisfies
(13.3.3). Set $u = \sum r_j x_j$ and expand $f(x_j)$ in powers of $u - x_j$ by Taylor's theorem
with remainder.

4. If f is a continuous function, convex in (a, b), derive the inequalities (13.3.8) directly
from (13.3.5).

5. If f is continuous and convex, show that

$$f[\tfrac{1}{2}(a + b)] \le \frac{1}{b - a} \int_a^b f(t)\, dt.$$
6. Let g be non-negative and continuous and let p > 1. Show that

$$\Big[\frac{1}{b - a} \int_a^b g(t)\, dt\Big]^p \le \frac{1}{b - a} \int_a^b [g(t)]^p\, dt.$$

Use properties of convex functions and the definition of the integral by Riemann
sums. The inequality is simply (4.4.17).

7. Let g be real and continuous. Show that

$$\exp\Big[\frac{1}{b - a} \int_a^b g(t)\, dt\Big] \le \frac{1}{b - a} \int_a^b \exp[g(t)]\, dt.$$

8. Let g be positive and continuous. Show that

$$\frac{1}{b - a} \int_a^b \log[g(t)]\, dt \le \log\Big[\frac{1}{b - a} \int_a^b g(t)\, dt\Big].$$

9. If f is continuous and convex in (a, b) and if a + h < x < b − h, show that

$$f(x) \le \frac{1}{2h} \int_{x-h}^{x+h} f(t)\, dt.$$

Actually this condition is sufficient as well as necessary for convexity of a continuous
function.

10. Denote the right member of the preceding inequality by T[f](x). Use the sup-norm
in C*[a,b] and show that ||T|| = 1. Show that f(x) = T[f](x) is satisfied by
$\alpha x + \beta$ for any non-negative numbers $\alpha$ and $\beta$.

11. Let D be a convex domain in the plane, f a continuous and convex function defined
in D. Formulate and prove some generalizations of Theorem 13.3.2 for this case.

12. Suppose that f has continuous second order partials. How does Theorem 13.3.3
generalize?

13. Can the surface z = f(x, y) have saddle points?

14. Let f have continuous derivatives of order up to and including the third. Expand
f(x, y) in powers of $(x - x_0)$ and $(y - y_0)$ for $(x_0, y_0) \in D$. In the Taylor expansion
the second order terms form a quadratic form. What can be said about this form?

15. Check the proof of Theorem 13.3.5.


The next four problems are accredited to E. G. Eggleston (1966). 


16. If f is continuous and convex in $R^m$ and f(0) = 0, prove that $r^{-1} f(rx)$ increases with r
for any fixed x.

17. Prove the same is true for $r^{-1}[f(x_0 + rx) - f(x_0)]$ for fixed x and $x_0$.

18. Show that the first differential

$$\delta f(x_0; h) = \lim_{\alpha \downarrow 0} \alpha^{-1}[f(x_0 + \alpha h) - f(x_0)]$$

exists for all $x_0$. Note that $\alpha$ is restricted to positive values.

19. If f is a gauge function, show that $\delta f(x_0; h) \le f(h)$.
13.4 NON-ARCHIMEDEAN VALUATIONS


This originally purely algebraic concept has some aspects involving functional
inequalities on a product space. The algebraists are concerned with certain mappings
of a field or of a ring into R. For our purposes we may restrict ourselves to
the case where the domain of definition of the mapping is $R$, $R^+$ or $Z^+$. In the
notation of Section 13.1

$$\circ = +, \quad G(u, v) = \max(u, v) \tag{13.4.1}$$

and the inequality is

$$f(x + y) \le \max[f(x), f(y)]. \tag{13.4.2}$$

With what N. Bourbaki would call "un abus de langage" we refer to such a
function as a valuation of the domain.

If $D = R^+$ or $Z^+$, then any decreasing function of x satisfies (13.4.2). This
means that in $R^+$ a valuation may tend arbitrarily fast to +∞ as x decreases to 0
and/or arbitrarily fast to −∞ as x increases to +∞.


Theorem 13.4.1. A measurable valuation defined on $R^+$ which assumes the
value +∞ at most in a set of measure zero is bounded above on compact subsets.


Proof. We use the same argument as in the proof of Theorem 13.1.1. If the
assertion is false for a particular valuation f, then there is an interval [a, b],
0 < a < b < ∞, and a point set $\{s_n\}$ such that (1) $a \le s_n \le b$, (2) $\lim s_n = s_0 \ge a$,
and (3) $f(s_n) > n$ for all n. Then for $0 < s < s_n$

$$n < f(s_n) = f(s_n - s + s) \le \max[f(s_n - s), f(s)]. \tag{13.4.3}$$

It follows that either f(s) > n or $f(s_n - s) > n$ and the measurable set

$$S_n = \{t : n < f(t),\ 0 < t < s_n\}$$

has measure

$$m[S_n] \ge \tfrac{1}{2} s_n \ge \tfrac{1}{2} a.$$

From this point on the argument follows that given in the proof of Theorem 13.1.1
without modification. The reader should convince himself that this is indeed the
case. ∎


Actually a stronger statement than Theorem 13.4.1 may be made. 


Theorem 13.4.2. A measurable valuation on $R^+$ which equals +∞ at most in a
set of measure zero is bounded above on every interval [a, ∞) where 0 < a.


Proof. Suppose that

$$f(x) \le M \quad \text{for } a \le x \le 2a. \tag{13.4.4}$$

Such an M exists by the preceding theorem. If now $2a \le x \le 4a$, then $a \le \tfrac{1}{2}x \le 2a$
and

$$f(x) = f(\tfrac{1}{2}x + \tfrac{1}{2}x) \le \max[f(\tfrac{1}{2}x), f(\tfrac{1}{2}x)] \le M.$$

The proof is completed by induction. ∎
Though f(x) is bounded above and conceivably also below, it need not tend to
any limit as x → ∞. Thus the Dirichlet function f(x) which is 0 or 1 according as x
is rational or irrational is a valuation on R and does not tend to any limit.

In the discussion of Čebyšev constants in Section 15.6 we shall consider
valuations on $Z^+$. Here the condition

$$a_{m+n} \le \max(a_m, a_n), \quad 0 < m, n, \tag{13.4.5}$$

implies that $a_n$ is bounded above but implies neither boundedness from below nor
the existence of a limit of $a_n$. The convergence properties of valuation sequences
show some peculiar features some of which are illustrated in Exercise 13.4.
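Condition (13.4.5) can be tested by brute force on an initial segment of a sequence. The sketch below is illustrative only: the function names are invented here, and the test sequences are the block 1, 1, 0 of Exercise 13.4, the decreasing sequence −n, and the increasing sequence n.

```python
def is_valuation_sequence(a, n_max):
    """Check a(m + n) <= max(a(m), a(n)) for all m, n >= 1 with m + n <= n_max."""
    return all(a(m + n) <= max(a(m), a(n))
               for m in range(1, n_max)
               for n in range(1, n_max - m + 1))

def block_110(n):
    """n-th term (n >= 1) of the sequence 1, 1, 0, 1, 1, 0, ..."""
    return 0 if n % 3 == 0 else 1

def decreasing(n):
    """A decreasing sequence: a valuation sequence that is unbounded below."""
    return -n

def increasing(n):
    """An increasing sequence: fails already at m = n = 1."""
    return n

block_ok = is_valuation_sequence(block_110, 60)
bounded_by_first = all(block_110(n) <= block_110(1) for n in range(1, 61))
decreasing_ok = is_valuation_sequence(decreasing, 60)
increasing_ok = is_valuation_sequence(increasing, 10)
```

The block sequence has the two cluster points 0 and 1, so it illustrates a valuation sequence without a limit, while −n illustrates one that is not bounded below.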

The generalization to 


$$f(x \circ y) \le \max[f(x), f(y)] \tag{13.4.6}$$


may possibly be of some interest. 


EXERCISE 13.4 


1. Check the proof of Theorem 13.4.1.

2. What is the analogue of Theorem 13.1.2?

3. Extend Theorems 13.4.1 and 13.4.2 to valuations on R.

4. Suppose that f is a non-negative valuation on R and that $\lim_{h \to 0} f(h) = f(0) = 0$. Show
that f is continuous everywhere.

5. If $f(n) = a_n$ is a valuation on $Z^+$, show that $a_n \le a_1$ for all n. Is the sequence
necessarily bounded below?

6. A sequence $\{a_n\}$ consists of the block 1, 1, 0 repeated infinitely often. The cluster
points of the sequence are obviously 0 and 1. Show that it is a valuation sequence.

7. Show that if a valuation sequence $\{a_n\}$ has a unique limit b, then $b \le a_n$ for all n.

8. Show that every integer $> j^2$ admits of a representation of the form mj + n(j + 1)
where m and n are non-negative integers. Is the representation unique?

9. Show that if two consecutive entries in a valuation sequence $\{a_n\}$ satisfy $a_j \le a$,
$a_{j+1} \le a$, then $\limsup_{n \to \infty} a_n \le a$.

10. The following is the most common algebraic valuation, the p-adic case. It is defined
on Q, the field of rational numbers. Let p be a given prime. If $x = m/n \in Q$, set
$x = p^t a/b$ where a, b, m, n, t are integers and a and b prime to p. Define f(x) = −t
and show that f is a valuation on Q. In higher algebra it is customary to demand that
a valuation satisfy f(xy) = f(x) + f(y) as well as (13.4.2). This condition is
evidently satisfied by f(x) = −t.
14 FUNCTIONAL EQUATIONS


Numerous functional equations have been encountered in earlier chapters of this
treatise. Thus in the discussion of fixed point theorems we were concerned with
a mapping f → T[f] of a metric space X into itself. The invariant elements
are solutions of the functional equation

$$f = T[f].$$

When T or some power of T is a contraction, existence and uniqueness of a
solution could be proved. Examples were discussed in Chapter 12.

To any functional inequality in a product space corresponds a functional
equation. Thus to (13.1.2) corresponds

$$f(x \circ y) = G[f(x), f(y)].$$

Some instances of this and of the more general equation

$$f[g(x, y)] = H[f(x), f(y), x, y]$$

will be discussed below. Here g and H are given and f is to be found. We
have here a much greater freedom of choice in the mappings than in Chapter 13.
Not merely the domain but also the range is at our disposal.

Differential and integral equations are usually excluded from consideration 
by workers on functional equations as an economy of effort. We shall not do this. 
The field is vast and our selection of material may appear haphazard. Each 
functional equation creates its own problems and there is little of general theory 
available. We have tried to emphasize the a priori aspects: what information 
about the solutions is hidden in the equation and obtainable without solving the 
equation? 

There are three sections: "Cryptoanalysis"; Cauchy's equations and generalizations;
and Uniqueness theorems.


14.1 “CRYPTOANALYSIS” 


Mathematics is full of codes which have to be deciphered. Thus any functional
equation contains a message more or less well hidden. All properties of the
solutions are built into the equation and our problem is to bring them out by
asking appropriate questions. Note that this information is available before an
explicit solution is obtained, if it is obtainable. In the following an effort will
be made to show what type of a priori information is obtainable and how to obtain
it. This type of "cryptoanalysis" will be illustrated by special examples. There
is no general theory at our disposal.

First let us remind the reader that it is necessary to specify what mappings 
are to be considered. It is usually necessary to state the domain of f and also 
the range. Sometimes even limitations on the graph {x, f(x)} have to be taken 
into account and may seriously affect the existence and nature of solutions. 
Equations look formally the same but in one case the variables may be real numbers, 
in another matrices, quaternions or vectors. 

Secondly, even after such basic decisions have been made, we note that 
properties of solutions may be of two types: categorical and conditional. The 
first type pertains to all solutions in the given range, the second only to solutions 
satisfying some restrictive condition such as boundedness or measurability, etc. 
The first type occurs fairly seldom, the second type is what one normally operates 
with in this theory. 


Example 1. Take the simple first order differential equation 
w’(z) = w(z). (14.1.1) 

Here the natural domain and range are the complex plane C. The following is 
an incomplete list of categorical properties: 

(1) If w(z) is a solution, so is w(z + a) for any fixed a. 

(2) w(0) w(z + a) = w(a) w(z) for any fixed a. 

(3) Either $w(z) \ne 0$ for all z or $w(z) \equiv 0$.

(4) $w^{(n)}(z)$ exists for all n and equals w(z).

(5) w(z) is an entire function of z. 

Verifications are left to the reader. 


Example 2. For a contrast, take one of Cauchy’s equations 


$$f(x + y) = f(x) + f(y), \tag{14.1.2}$$

which will be studied in detail in the next section. We consider mappings f of
reals into reals defined for all real x. Among the categorical properties we note
that f(0) = 0 and f(nx) = nf(x), more generally, f(qx) = qf(x) for any
rational q. On the other hand, measurability or boundedness on compact
subsets are not categorical, nor is the relation f(rx) = rf(x) for any real number
r true for all solutions.


Example 3. The functional equation 
R(s) — R(t) = (t — s) R(s) R(t) (14.1.3) 


is basic in resolvent theory where it is known as the first resolvent equation. See
Sections 1.5 and 9.2 where explicit solutions are to be found first for matrix
algebras and then for a general non-commutative Banach algebra $\mathfrak{B}$ with unit
element e. Let us consider the latter case. Here s is a complex variable; R(s),
a $\mathfrak{B}$-valued function, is defined and satisfies (14.1.3) in some domain D of the
complex plane.

One categorical property is discernible: R(s) and R(t) must commute since
the equation is unchanged if s and t are interchanged. Other properties appear
to be conditional. We shall show under mild restrictions on R that this
function will be $\mathfrak{B}$-holomorphic in the terminology of Section 8.1. More precisely
we shall show that

local boundedness ⇒ continuity ⇒ differentiability ⇒ analyticity.

Suppose that in a disk |s − a| < r contained in D we have

$$\|R(s)\| \le M(a). \tag{14.1.4}$$

This implies

$$\|R(s) - R(t)\| \le [M(a)]^2\, |s - t|,$$

a Lipschitz condition. Hence a locally bounded solution is continuous. Further,
a continuous solution is differentiable since

$$\frac{R(t) - R(s)}{t - s} = -R(s) R(t) \to -[R(s)]^2 \tag{14.1.5}$$

as t → s, so that R'(s) exists and

$$R'(s) = -[R(s)]^2.$$

This implies the existence of derivatives of all orders and

$$R^{(n)}(s) = (-1)^n\, n!\, [R(s)]^{n+1}. \tag{14.1.6}$$

Cf. Section 9.2. Thus locally the solution of (14.1.3) is given by

$$R(s) = \sum_{n=0}^{\infty} A^{n+1} (a - s)^n, \quad A = R(a). \tag{14.1.7}$$

The power series converges at least for $|s - a| < \|A\|^{-1}$ and we know conversely
that any such power series satisfies (14.1.3) for s and t in the circle of
convergence.
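For a concrete check of (14.1.3) and (14.1.7) one may take the algebra of 2 × 2 matrices and $R(s) = (sI - T)^{-1}$, which is the matrix case referred to above. The sketch below makes several assumptions of its own: the matrix `T`, the sample points and all helper names are chosen purely for illustration, and plain nested lists are used so that no library is needed.

```python
def mat_mul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_add(X, Y):
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]

def mat_scale(c, X):
    return [[c * X[i][j] for j in range(2)] for i in range(2)]

def mat_inv(X):
    """Inverse of a 2x2 matrix by the adjugate formula."""
    det = X[0][0] * X[1][1] - X[0][1] * X[1][0]
    return [[X[1][1] / det, -X[0][1] / det],
            [-X[1][0] / det, X[0][0] / det]]

def max_abs_diff(X, Y):
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

T = [[1.0, 2.0], [0.0, 3.0]]  # hypothetical example; its spectrum is {1, 3}

def resolvent(s):
    """R(s) = (s I - T)^(-1), defined off the spectrum of T."""
    return mat_inv([[s - T[0][0], -T[0][1]], [-T[1][0], s - T[1][1]]])

def resolvent_equation_gap(s, t):
    """Largest entry of R(s) - R(t) - (t - s) R(s) R(t)."""
    lhs = mat_add(resolvent(s), mat_scale(-1, resolvent(t)))
    rhs = mat_scale(t - s, mat_mul(resolvent(s), resolvent(t)))
    return max_abs_diff(lhs, rhs)

def series_gap(a, s, terms=60):
    """Compare a partial sum of (14.1.7), with A = R(a), against R(s)."""
    A = resolvent(a)
    total = [[0.0, 0.0], [0.0, 0.0]]
    power = A  # holds A^(n+1) at step n
    for n in range(terms):
        total = mat_add(total, mat_scale((a - s) ** n, power))
        power = mat_mul(power, A)
    return max_abs_diff(total, resolvent(s))

equation_gap = resolvent_equation_gap(5 + 1j, -2 + 0.5j)
power_series_gap = series_gap(6.0, 6.3)
```

Both gaps should be of the order of rounding error; the point a = 6 lies at distance 3 from the spectrum of the sample matrix, so |s − a| = 0.3 is well inside the region of convergence.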

The following examples are concerned with a priori estimates of rates of 
growth and nature of infinitudes of solutions of ordinary differential equations. 
A special case of such a problem occurs in Theorem 12.4.2. This type of dis- 
cussion does not seem to have made much headway for other types of functional 
equation. Normally it appears to be irrelevant, but it should be meaningful for 
difference equations and for functions defined by addition theorems. 
Example 4. We start with the non-linear first order differential equation

$$w'(z) = 1 + [w(z)]^2. \tag{14.1.8}$$

Domain and range are the complex plane. Since the equation is autonomous,
i.e. unchanged under the shift z → z + a, we can exhibit one categorical property
right away: if w(z) is a solution, so is w(z + a). Another categorical property is
involved in the following trichotomy: either $w(z) \ne \pm i$ for all z or $w(z) \equiv i$ or
$w(z) \equiv -i$. This follows from the fact that there is one and only one solution
which takes on the value +i at z = a and the same holds for −i.

Another categorical property may be formulated as follows: if w(z) is a
non-constant solution of (14.1.8), then w(z) necessarily becomes infinite for z
tending to some finite value b. Moreover, the infinitudes are simple poles of
residue −1 and for any given value b there is one (and only one) solution which
admits of z = b as a pole. This naturally follows from the fact that any non-constant
solution of the equation is of the form tan(z − a). But this is such a
basic fact that it should be obtainable directly from the equation without the
detour via the explicit solutions. There are at least two ways of accomplishing this.

One of the methods leads to another categorical property of the solutions.
There is a one-parameter family of linear fractional transformations

$$w \to \frac{w + c}{1 - cw} \tag{14.1.9}$$

which leaves the equation invariant. Here c is any real or complex number.
Thus if w(z) is a solution of (14.1.8) so is

$$\frac{w(z) + c}{1 - c\,w(z)}. \tag{14.1.10}$$

The connection of this result with the addition theorem of the tangent function
is obvious. See Problem 5, Exercise 14.1. For our purposes the transformation

$$v = -\frac{1}{w} \tag{14.1.11}$$

is more useful. Actually it may be obtained as a limiting case of (14.1.9) by letting
c → ∞. This change of dependent variable gives

$$v'(z) = 1 + [v(z)]^2, \tag{14.1.12}$$

i.e. reproduces the original equation except for notation. Now it is clear that this
equation has a solution which assumes the value 0 at z = b, say v(z; b, 0). The
equation says that the derivative takes the value 1 at z = b. This means that
(14.1.8) has the solution $-[v(z; b, 0)]^{-1}$ with a simple pole at z = b where the
residue is −1. This is an a priori verification of the existence of solutions with a
simple pole and residue −1 at a preassigned point.

The second method is quicker but less precise. We suspect the existence of a
solution of (14.1.8) which becomes infinite as z approaches some point b. At
such a point w'(z) and $[w(z)]^2$ both become infinite and, comparing orders of
infinity, we should get some inkling of the facts. If $\alpha$ and $\beta$ are positive
numbers, suppose that

$$w(z) \sim \beta (z - b)^{-\alpha} \tag{14.1.13}$$

up to terms of lower order. Then, formally,

$$w'(z) \sim -\alpha\beta (z - b)^{-\alpha - 1}, \qquad [w(z)]^2 \sim \beta^2 (z - b)^{-2\alpha},$$

again up to terms of lower order. Writing down the condition that leading
terms on both sides of equation (14.1.8) must cancel, we get $-\alpha - 1 = -2\alpha$,
$-\alpha\beta = \beta^2$, or $\alpha = 1$, $\beta = -1$. Thus we are justified in expecting simple poles
of residue −1 at the singular points. We know that this is correct. Since b is
arbitrary, we say that equation (14.1.8) has movable singularities which are simple
poles, another categorical property.
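The assertions of Example 4 can be checked numerically on the explicit solution tan(z − a). This is a minimal sketch for real arguments only; the shift a = 0.3, the sample points and the function names are assumptions made for the illustration.

```python
import math

def w(z, a=0.3):
    """The explicit solution tan(z - a) of w' = 1 + w^2."""
    return math.tan(z - a)

def ode_residual(z, a=0.3, h=1e-5):
    """Central-difference estimate of w'(z) minus 1 + w(z)^2."""
    derivative = (w(z + h, a) - w(z - h, a)) / (2.0 * h)
    return derivative - (1.0 + w(z, a) ** 2)

def transformed(z, c, a=0.3):
    """The image of w under the linear fractional map w -> (w + c)/(1 - c w)."""
    return (w(z, a) + c) / (1.0 - c * w(z, a))

def scaled_near_pole(eps, a=0.3):
    """(z - b) w(z) at z = b + eps, where b = a + pi/2 is a pole of w."""
    b = a + math.pi / 2.0
    z = b + eps
    return (z - b) * w(z, a)

residual = ode_residual(0.7)
# By the addition theorem, the transformed solution is again a tangent,
# shifted by arctan(c).
shift_gap = transformed(0.7, 0.5) - math.tan(0.7 - 0.3 + math.atan(0.5))
residue_estimate = scaled_near_pole(1e-4)
```

The last quantity approximates the residue at the pole and should be close to −1, in agreement with the value of $\beta$ found by comparing leading terms.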

Our last example is more sophisticated, and here the last word has not been
said. See P. J. Rijnierse (1968) and E. Hille (1969).


Example 5. The equation

$$y''(x) = x^{-1/2} [y(x)]^{3/2} \tag{14.1.14}$$

was introduced in nuclear physics by L. H. Thomas and Enrico Fermi independently
of each other in 1927. It is known as the Thomas–Fermi equation.
We are not concerned with electrical fields in an atom; just a priori properties of
solutions of the equation regardless of physical significance, if any.

It is natural to restrict oneself to real positive values of x and to solutions
with positive values in some interval (c, d). The equation then shows that
y''(x) > 0 in (c, d), so the graph of y is concave upward and there can be at most
one (positive) minimum in the interval. The equation has movable singularities
(categorical property of the equation) and actually of two different kinds according
as y tends to zero or to infinity as x approaches the point x = b in question. The
existence of points of the first kind is obvious since the initial values y(b) = 0,
$y'(b) = c \ne 0$ are admissible and determine a solution which is positive to the
left of x = b if c < 0 and to the right of b if c > 0. This turns out to be an
algebraic branch point. The second type of singularity where y becomes infinite
is less obvious but is equally common. As a matter of fact, as soon as a solution
starts to grow it is doomed to become infinite for some finite value of x. This
observation seems to go back to L. Brillouin (1934). What happens to the
solution beyond the critical value appears to be a moot question.

Let us see what information the second method under Example 4 can give
in this case. Suppose that as x increases toward some number b

$$y(x) \sim \beta (b - x)^{-\alpha}, \quad \alpha > 0, \quad \beta > 0, \tag{14.1.15}$$

up to terms of lower order. Again, proceeding formally, we have

$$x^{-1/2} [y(x)]^{3/2} \sim b^{-1/2} \beta^{3/2} (b - x)^{-3\alpha/2},$$
$$y''(x) \sim \alpha\beta(\alpha + 1)(b - x)^{-\alpha - 2}$$

up to terms of lower order. Comparison of dominant terms gives

$$\alpha = 4, \quad \beta = 400b. \tag{14.1.16}$$

So far so good, but there is no Laurent expansion at x = b, so the singularity cannot
be a pole.
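The balance of dominant terms leading to (14.1.16) can be verified by direct computation. The sketch below assumes the leading term $400b(b - x)^{-4}$ and checks that its second derivative agrees with the right member of the equation when the factor $x^{-1/2}$ is frozen at $b^{-1/2}$; the function names and the sample values of b and x are chosen only for illustration.

```python
from fractions import Fraction
import math

# Exponent balance: -alpha - 2 = -3 alpha / 2 has the solution alpha = 4.
alpha = Fraction(4)
exponents_balance = (-alpha - 2 == Fraction(-3, 2) * alpha)

# Coefficient balance: alpha (alpha + 1) beta = b^(-1/2) beta^(3/2),
# i.e. beta = [alpha (alpha + 1)]^2 b, so the numerical factor is 400.
beta_factor = (alpha * (alpha + 1)) ** 2

def leading_term(x, b):
    """Conjectured dominant behaviour 400 b (b - x)^(-4)."""
    return 400.0 * b * (b - x) ** -4

def second_derivative(x, b):
    """Exact second derivative of leading_term, 8000 b (b - x)^(-6)."""
    return 8000.0 * b * (b - x) ** -6

def frozen_right_side(x, b):
    """b^(-1/2) y^(3/2), the right member with x^(-1/2) frozen at x = b."""
    return b ** -0.5 * leading_term(x, b) ** 1.5

def true_right_side(x, b):
    """x^(-1/2) y^(3/2), the right member of the equation itself."""
    return x ** -0.5 * leading_term(x, b) ** 1.5

b, x = 2.0, 1.999
frozen_ratio = frozen_right_side(x, b) / second_derivative(x, b)
true_ratio = true_right_side(x, b) / second_derivative(x, b)
```

The first ratio equals 1 up to rounding, while the second equals $(b/x)^{1/2}$ and tends to 1 only as x → b; this is a reminder that the computation concerns leading terms and not an exact solution.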

Nevertheless, there do exist solutions which become infinite as x increases
to a preassigned point x = b > 0 and which satisfy the inequality

$$y(x) \le 400b\,(b - x)^{-4}. \tag{14.1.17}$$


This also follows from a prioristic considerations but it would take us too far afield 
to develop these more sophisticated methods. 


EXERCISE 14.1 


1. Verify the assertions under Example 1 above.
2. Discuss the equation
$$(s - t)\, S(a)\, S(s)\, S(t) = (s - a)\, S(s) - (t - a)\, S(t)$$
along the lines of Example 3 above. Assume $S(s) \in \mathfrak{B}$ and to be defined and bounded
in a domain D containing the point a.
3. Discuss the dissolvent equation
$$s D(s) - t D(t) = (s - t)\, D(s)\, D(t)$$
along similar lines. $\mathfrak{B}$ need not have a unit element and the domain D should not
contain the origin.
4. Verify that (14.1.9) leaves (14.1.8) invariant. What happens when c = i or −i?
5. From this fact derive the addition theorem for a non-constant solution of (14.1.8).


6. Determine the probable nature of the movable singularities in the case of the equation 
w(z) =1 + [w(@2)]’*. 

7. Same question for [w’(z)]? =1 + [w(z)]*. 

8. Consider the equation $R'(z) = -[R(z)]^2$ in the setting of Example 3 where R is
$\mathfrak{B}$-valued in some domain of the complex plane. The a priori methods of Example 4
suggest that a movable singularity should be a simple pole with an idempotent as a
residue. The discussion in Section 1.5 shows that this hypothesis is wrong already
for matrices. Where is the error in the reasoning?

9. Could a solution of this equation be algebraically singular wherever it exists?

10. In Example 2 it is stated that f(qx) = qf(x) for all rational numbers q when f is a
solution of (14.1.2). Prove this.
14.2 CAUCHY'S EQUATIONS AND GENERALIZATIONS

Cauchy in the 1820's studied the four equations

$$f(x + y) = f(x) + f(y), \tag{14.2.1}$$
$$f(x + y) = f(x) f(y), \tag{14.2.2}$$
$$f(xy) = f(x) f(y), \tag{14.2.3}$$
$$f(xy) = f(x) + f(y). \tag{14.2.4}$$

They are closely related. For the time being x and y are real and f real or
complex valued. Cauchy proved that if f is a continuous solution of the first
equation, then there exists a constant a such that

$$f(x) = ax, \quad \forall x. \tag{14.2.5}$$

Now actually, if f is continuous at a single point, then it is continuous
everywhere and (14.2.5) holds. Suppose that f is continuous at x = b. Then

$$f(b + h) - f(b) = f(h) \to 0 \quad \text{with } h.$$

Hence

$$f(s + h) - f(s) = f(h) \to 0 \quad \text{with } h,$$

so that f is continuous also at x = s, i.e. everywhere. This observation is due
to the French mathematician Gaston Darboux (1842-1917) in 1875. The
introduction of measurability and the Lebesgue integral led to further results.


Theorem 14.2.1. If a solution f of (14.2.1) is Lebesgue integrable over finite 
intervals, then (14.2.5) holds. 


Proof. We have

$$f(x) = \int_0^1 f(x + y)\, dy - \int_0^1 f(y)\, dy = \int_x^{x+1} f(s)\, ds - \int_0^1 f(s)\, ds.$$

Now a definite integral is a continuous function of its limits because

$$\int_S f(s)\, ds \to 0 \quad \text{with } m[S]. \tag{14.2.6}$$

This makes f continuous, and the integral of a continuous function is
differentiable with respect to variable limits of integration. Hence f is differentiable,
and the usual formulas of the calculus give

$$f'(x) = f(x + 1) - f(x) = f(1) \quad \text{or} \quad f(x) = f(1)\, x, \tag{14.2.7}$$

since f(0) = 0.
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For real-valued solutions one-sided boundedness on a set of positive measure 
or, as an alternative, measurability on such a set is sufficient to ensure 
continuity. The first result stated is due to Alexander Ostrowski (1929), who 
also proved that a non-measurable solution must be unbounded on every set of 
positive measure. 

In the meantime, non-measurable solutions had been constructed in 1905 
by Georg Hamel (1877-1954). The gist of Hamel’s construction is the
following. On the basis of the axiom of choice one proves that the reals can be
well ordered and this implies the existence of a non-denumerable basis $\{x_\alpha\}$ for
$R^1$ in terms of which each real number $x$ has a unique representation

$$x = \sum_\alpha q(\alpha)\, x_\alpha, \tag{14.2.8}$$

where the coefficients $q(\alpha)$ are rational numbers and the sum involves only a finite
number of summands. We choose now an arbitrary non-denumerable set of
real numbers $\{z_\alpha\}$ and define $f$ on the basis elements by

$$f(x_\alpha) = z_\alpha, \quad \forall \alpha, \tag{14.2.9}$$

and set

$$f(x) = \sum_\alpha q(\alpha)\, z_\alpha. \tag{14.2.10}$$

This defines $f(x)$ uniquely for all $x$. Moreover, $f$ satisfies (14.2.1). For if $x$ is
given by (14.2.8) and

$$y = \sum_\alpha r(\alpha)\, x_\alpha,$$

then

$$x + y = \sum_\alpha [q(\alpha) + r(\alpha)]\, x_\alpha, \qquad f(x + y) = \sum_\alpha [q(\alpha) + r(\alpha)]\, z_\alpha,$$

so that $f$ is a solution. Note that in the expression for $y$, members belonging
to other $\alpha$’s than in $x$ might be different from 0 but there is still only a finite
number of coefficients different from 0. Moreover, if not all $z_\alpha$’s are proportional
to the corresponding $x_\alpha$’s, then this solution is not of the form (14.2.5), so it
cannot be continuous or measurable.
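
A genuine Hamel basis of the reals cannot be written down, but the mechanism of (14.2.8) to (14.2.10) can be sketched on a small rational vector space. The following is only an illustration under stated assumptions: the space is the two-dimensional rational span of 1 and $\sqrt{2}$, the names `BASIS`, `Z`, `f`, `add`, `value` are hypothetical, and the values in `Z` are chosen arbitrarily.

```python
from fractions import Fraction
from math import sqrt

# Toy analogue of Hamel's construction on the rational span of {1, sqrt 2}.
# An element x is stored by its rational coordinates (q1, q2).
BASIS = (1.0, sqrt(2.0))          # plays the role of the x_alpha
Z = (Fraction(1), Fraction(5))    # plays the role of the z_alpha (arbitrary choice)

def f(q):
    """f(sum q(alpha) x_alpha) = sum q(alpha) z_alpha, as in (14.2.10)."""
    return sum(qa * za for qa, za in zip(q, Z))

def add(q, r):
    """Coordinates of x + y from the coordinates of x and y."""
    return tuple(a + b for a, b in zip(q, r))

def value(q):
    """Floating-point value of the real number with coordinates q."""
    return sum(float(qa) * xa for qa, xa in zip(q, BASIS))

x = (Fraction(1, 2), Fraction(3))     # 1/2 + 3*sqrt(2)
y = (Fraction(-2), Fraction(1, 4))    # -2 + (1/4)*sqrt(2)

additive = f(add(x, y)) == f(x) + f(y)

# f(x)/x differs at x = 1 and at x = sqrt 2, so f is not of the form f(1)*x.
one = (Fraction(1), Fraction(0))
root2 = (Fraction(0), Fraction(1))
ratio_at_one = float(f(one)) / value(one)
ratio_at_root2 = float(f(root2)) / value(root2)
```

Because the $z_\alpha$ here are not proportional to the $x_\alpha$, the two ratios disagree, which is the finite-dimensional shadow of the non-measurable solutions described above.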

Solutions of (14.2.1) obviously have the property 


$$f(nx) = n f(x) \tag{14.2.11}$$
for all positive integers $n$. This implies that
$$f(rx) = r f(x) \tag{14.2.12}$$

for all rational numbers $r$. On the other hand, the relation need not hold for
irrational values of $r$ if $f$ is non-measurable. This striking result is due to
Z. Daróczy (1961) and L. Losonczi (1964). The following construction of such
solutions was communicated to the author by J. H. B. Kemperman (1969). 




Let $\rho$ and $\sigma$ be two distinct real transcendental numbers and let $Q(\rho)$ and
$Q(\sigma)$ be the field extensions of the rational field $Q$ obtained by adjunction of $\rho$
and $\sigma$ respectively. The elements of $Q(\rho)$ are simply rational functions with rational
coefficients in the mark $\rho$ and $Q(\sigma)$ is obtained by replacing $\rho$ by $\sigma$. This
defines an isomorphic mapping of $Q(\rho)$ onto $Q(\sigma)$ which we denote by $Z$.
Next we “construct” a Hamel basis for $R^1$ over $Q(\rho)$, say the set $\{x_\alpha\}$. Then
every $x \in R^1$ has a unique representation of the form

$$x = \sum_\alpha q_\rho(\alpha)\, x_\alpha, \tag{14.2.13}$$

where each $q_\rho(\alpha) \in Q(\rho)$ and the number of summands is finite. If now
$y = \sum_\alpha r_\rho(\alpha)\, x_\alpha$, then $x + y = \sum_\alpha [q_\rho(\alpha) + r_\rho(\alpha)]\, x_\alpha$.
Since the representation of a real number in terms of a Hamel basis is
unique, the last formula is the Hamel representation of $x + y$. We now define

$$f(x) = \sum_\alpha Z[q_\rho(\alpha)]\, x_\alpha = \sum_\alpha q_\sigma(\alpha)\, x_\alpha. \tag{14.2.14}$$

Here $Z$ is the isomorphic mapping of $Q(\rho)$ onto $Q(\sigma)$ and $q_\sigma(\alpha)$ is the image of
$q_\rho(\alpha)$. Since sums go into sums under $Z$, we have

$$f(x + y) = \sum_\alpha Z[q_\rho(\alpha) + r_\rho(\alpha)]\, x_\alpha
= \sum_\alpha Z[q_\rho(\alpha)]\, x_\alpha + \sum_\alpha Z[r_\rho(\alpha)]\, x_\alpha
= \sum_\alpha q_\sigma(\alpha)\, x_\alpha + \sum_\alpha r_\sigma(\alpha)\, x_\alpha = f(x) + f(y).$$

Thus $f$ satisfies (14.2.1). We shall now prove that

$$f(\rho x) = \sigma f(x), \quad \forall x. \tag{14.2.15}$$

Here we need that $Z$ acting on $Q(\rho)$ takes products into products as well as sums
into sums. Thus

$$f(\rho x) = \sum_\alpha Z[\rho\, q_\rho(\alpha)]\, x_\alpha = \sum_\alpha Z(\rho)\, Z[q_\rho(\alpha)]\, x_\alpha
= \sigma \sum_\alpha q_\sigma(\alpha)\, x_\alpha = \sigma f(x)$$

as asserted. Thus (14.2.12) need not hold for irrational values of $r$. This
solution is, of course, non-measurable.

Richard C. Metzler has called my attention to the fact that the construction
may be modified so that (14.2.15) holds for a finite number of distinct values of $\rho$
and corresponding values of $\sigma$, all transcendental numbers. If $\rho$ is algebraic,
(14.2.15) requires that $\sigma$ be algebraic; in fact, $\rho$ and $\sigma$ must satisfy the same
irreducible equation. See, further, Exercise 14.2.
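
The algebraic case mentioned last can be made concrete on a small scale. The sketch below is an illustration under stated assumptions, not the construction on all of $R^1$: it takes $\rho = \sqrt{2}$, $\sigma = -\sqrt{2}$ (both roots of $x^2 - 2 = 0$), replaces the reals by the field $Q(\sqrt{2})$ itself, and uses the one-element basis $\{1\}$ over $Q(\rho)$, so that (14.2.14) reduces to $f = Z$. All function names are hypothetical.

```python
from fractions import Fraction

# Elements of Q(sqrt 2) are pairs (a, b) standing for a + b*sqrt(2).
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    # (a + b r)(c + d r) with r*r = 2
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

def Z(u):
    """Field isomorphism sending rho = sqrt 2 to sigma = -sqrt 2."""
    return (u[0], -u[1])

RHO = (Fraction(0), Fraction(1))      # sqrt 2
SIGMA = Z(RHO)                        # -sqrt 2

# With the basis {1} over Q(rho), formula (14.2.14) is simply f = Z.
f = Z

x = (Fraction(3, 7), Fraction(-2))
y = (Fraction(1), Fraction(5, 3))

is_additive = f(add(x, y)) == add(f(x), f(y))          # (14.2.1)
twisted = f(mul(RHO, x)) == mul(SIGMA, f(x))           # (14.2.15)
not_homogeneous = f(mul(RHO, x)) != mul(RHO, f(x))     # (14.2.12) fails for r = rho
```

The map is additive and rationally homogeneous, yet multiplication by $\rho$ is carried into multiplication by $\sigma$, exactly the behavior asserted in (14.2.15).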

Let us make a brief excursion into the complex field. Equation (14.2.1) also
makes sense for complex variables:

$$f(z_1 + z_2) = f(z_1) + f(z_2), \tag{14.2.16}$$
where, however, there are continuous solutions besides $az$. We can take
$$f(x + iy) = ax + by \tag{14.2.17}$$

for any constants $a$ and $b$. If $a$ and $b$ are real, this is the general form of a
linear functional on $R^2$. The expression defines a differentiable function of $z$
iff $b = ai$.

We return to real variables. We may say that measurability implies continuity, 
which implies differentiability for solutions of (14.2.1). The same holds for the 
other Cauchy equations which are satisfied by 


$$e^{ax}, \qquad x^a, \qquad a \log x, \tag{14.2.18}$$


respectively. Here a is an arbitrary real or complex number, and in the case of 
(14.2.4) the variables x and y are positive. 

So far we have assumed $f$ to be a mapping from reals to reals, reals to
complex, or complex to complex. But there are many other possibilities. Thus
letting $x$ and $y$ remain real numbers, we can let $f$ define a mapping of $R^1$ into
$\mathfrak{M}_n$, the $n$ by $n$ matrices over $C$, and ask for solutions of the various equations
in this setting. In the case of (14.2.1) any continuous solution would be of the
form

$$f(x) = \mathcal{A}x, \tag{14.2.19}$$

where $\mathcal{A}$ is an arbitrary constant matrix. Equation (14.2.2) leads to

$$f(x) = \mathcal{P} \exp[\mathcal{A}x],$$

where, again, $\mathcal{A}$ is any constant matrix, the exponential function is defined by the
usual series, and $\mathcal{P}$ is an idempotent commuting with $\mathcal{A}$.

We can also turn the tables: let $f(\mathcal{X})$ be a number and $\mathcal{X}$ and $\mathcal{Y}$ matrices.
The equation
$$f(\mathcal{X} + \mathcal{Y}) = f(\mathcal{X}) + f(\mathcal{Y}) \tag{14.2.20}$$

now characterizes the additive functionals on $\mathfrak{M}_n$. Note that such a functional
need be neither homogeneous nor bounded. Assuming these additional
properties, a solution is given by

$$f(\mathcal{X}) = \sum_{j=1}^{n} \sum_{k=1}^{n} c_{jk}\, x_{jk}, \quad \text{where } \mathcal{X} = (x_{jk}), \tag{14.2.21}$$

and the coefficients $c_{jk}$ are arbitrary complex numbers. This formula is obtained
by writing

$$\mathcal{X} = \sum_{j=1}^{n} \sum_{k=1}^{n} x_{jk}\, \mathcal{E}_{jk}, \tag{14.2.22}$$

where $\mathcal{E}_{jk}$ is the matrix with a 1 in the place $(j, k)$ and zeros elsewhere. These
matrices form a basis for $\mathfrak{M}_n$ and we define $f$ on the basis elements by
$$f(\mathcal{E}_{jk}) = c_{jk}$$
and proceed by linearity to obtain (14.2.21).

Equation (14.2.3) in the setting $\mathfrak{M}_n \to C$ now characterizes multiplicative
functionals. The determinant of $\mathcal{X}$ is clearly a solution; more generally we can
take

$$f(\mathcal{X}) = [\det \mathcal{X}]^a, \tag{14.2.23}$$

where $a$ is a fixed number. This makes sense when $a$ and $\det \mathcal{X}$ are real positive.
The general solution found by M. Hosszú (1959) is $f(\mathcal{X}) = g(\det \mathcal{X})$, where $g$ is
multiplicative, $g(xy) = g(x)g(y)$, in the domain in question, $R^1$ or $C$.
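
That every $g(\det \mathcal{X})$ with multiplicative $g$ is a multiplicative functional follows from $\det(\mathcal{X}\mathcal{Y}) = \det \mathcal{X} \det \mathcal{Y}$. A minimal exact check for 2 by 2 rational matrices is sketched below; the helper names and the choice $g(t) = t^3$ are assumptions made only for the illustration.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(X):
    """Determinant of a 2 by 2 matrix."""
    return X[0][0] * X[1][1] - X[0][1] * X[1][0]

def g(t):
    # Any multiplicative g would do; the cube is a hypothetical choice.
    return t ** 3

def f(X):
    return g(det2(X))

X = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(5)]]
Y = [[Fraction(-1, 2), Fraction(4)], [Fraction(0), Fraction(7)]]

multiplicative = f(matmul(X, Y)) == f(X) * f(Y)
```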

We may, of course, also allow mappings from $\mathfrak{M}_n$ into $\mathfrak{M}_n$. Now the problem
becomes more involved. We see by inspection that

$$F(\mathcal{X}) = \mathcal{A}\mathcal{X}\mathcal{B} + \mathcal{C}\mathcal{X}\mathcal{D} \tag{14.2.24}$$
is a solution of

$$F(\mathcal{X} + \mathcal{Y}) = F(\mathcal{X}) + F(\mathcal{Y}) \tag{14.2.25}$$
for any choice of the constant matrices $\mathcal{A}$, $\mathcal{B}$, $\mathcal{C}$, $\mathcal{D}$. This is a continuous
solution and not even the most general one. The general solution is known, even
of the more general problem of mapping $m$ by $n$ matrices into $p$ by $q$ matrices.
Since only addition is involved, we can consider an $m$ by $n$ matrix as a vector
with $mn$ components and the problem is reduced to solving (14.2.1) for vectors.
With obvious change of notation we are now concerned with solutions of the
equation

$$\mathbf{F}(\mathbf{x} + \mathbf{y}) = \mathbf{F}(\mathbf{x}) + \mathbf{F}(\mathbf{y}), \tag{14.2.26}$$

where $\mathbf{x} \to \mathbf{F}(\mathbf{x})$ is a mapping from $C^n$ to $C^p$. According to Aczél (personal
communication, cf. pp. 215-16, 348 of his 1966 treatise) one proceeds as follows.
Let $\mathbf{e}_1, \ldots, \mathbf{e}_n$ be a basis of $C^n$ and $\mathbf{E}_1, \ldots, \mathbf{E}_p$ a basis of $C^p$. Each component
$F_k(\mathbf{x})$ of $\mathbf{F}(\mathbf{x})$ is additive:

$$F_k(\mathbf{x}_1 + \mathbf{x}_2) = F_k(\mathbf{x}_1) + F_k(\mathbf{x}_2), \tag{14.2.27}$$

which extends to any finite number of summands. If, now,

$$\mathbf{x} = \sum_{j=1}^{n} x_j \mathbf{e}_j, \tag{14.2.28}$$

we have

$$F_k(\mathbf{x}) = F_k\Big[\sum_{j=1}^{n} x_j \mathbf{e}_j\Big] = \sum_{j=1}^{n} F_k(x_j \mathbf{e}_j) = \sum_{j=1}^{n} g_{jk}(x_j), \tag{14.2.29}$$

where each scalar function $g_{jk}$ is additive:

$$g_{jk}(x + y) = g_{jk}(x) + g_{jk}(y).$$

Combining, we get

$$\mathbf{F}(\mathbf{x}) = \sum_{k=1}^{p} \Big[\sum_{j=1}^{n} g_{jk}(x_j)\Big] \mathbf{E}_k, \tag{14.2.30}$$
where the $g_{jk}$ are arbitrary additive functions, i.e. solutions of (14.2.1). This is
the general solution of (14.2.26). If the $g_{jk}$ are continuous, then so is $\mathbf{F}(\mathbf{x})$, but in
general this would not be the case. From the solution (14.2.30) of (14.2.26) we
can get the general solution of the matrix case (14.2.25) which served as our point
of departure. The details are left to the reader.

There is, of course, no reason for stopping at vectors and matrices. Thus
(14.2.1) is only another way of writing

$$T(x_1 + x_2) = T(x_1) + T(x_2), \tag{14.2.31}$$

which characterizes an additive mapping from one semi-group to another. The
transformation need not be bounded, if $\mathfrak{X}$ and $\mathfrak{Y}$ are metric spaces where
boundedness makes sense.

In the same spirit, if $s$ and $t$ are real and if $T(s)$ is a mapping from $R^1$
into $\mathfrak{E}(\mathfrak{X})$, the B-algebra of linear bounded transformations from $\mathfrak{X}$ to $\mathfrak{X}$, then
the equation

$$T(s + t) = T(s)\,T(t) \tag{14.2.32}$$

characterizes one-parameter groups or semi-groups of transformations.

The Cauchy equations may be generalized in a different direction. Let $\circ$
denote a binary associative operation, say from $R^1 \times R^1$ to $R^1$, let $G(u, v)$ be a
mapping from $R^1 \times R^1$ to $R^1$, and consider the equation

$$f(x \circ y) = G[f(x), f(y)]. \tag{14.2.33}$$

The Cauchy equations are clearly of this type with $\circ$ being addition or
multiplication and $G(u, v) = u + v$ or $uv$. This class of equations was studied by
C. T. Ionescu Tulcea in 1960. His general result is much too complicated to give
here. The following much-specialized version is due to R. C. Metzler (personal
communication, 1969). It is stated without a proof.


Theorem 14.2.2. Let $(s, t) \to s \circ t$ be a composition on $R^1 \times R^1$ to $R^1$ defined
for $(s, t) \in D \subset R^1 \times R^1$ and continuous in $D$. Suppose there exists a neutral
element $e$ such that for all $(e, t) \in D$ we have $e \circ t = t$. Suppose, further, that
for $(x, t)$ and $(e, u)$ in $D$ we can solve the equation $x \circ t = u$ for $x = p_u(t)$
uniquely and in such a way that the mapping $(u, t) \to p_u(t)$ is continuous. In
addition, suppose that for each $t$ under consideration there is a compact
neighborhood $N(t)$ of $t$ and a constant $\kappa(t)$ such that for all $s \in N(t)$ and open
sets $W \subset N(t)$, $\mu[p_s^{-1}(W)] < \kappa(t)\,\mu(W)$ where $\mu$ is Lebesgue measure. Then
if $G$ in (14.2.33) is continuous on the range of $f$ and if $f$ is measurable, $f$ is
necessarily continuous.

The conditions on $\circ$ are clearly satisfied by $+$ and $\cdot$ with neutral elements
0 and 1, respectively. A further case is

$$s \circ t = \frac{s + t}{1 + st}, \qquad e = 0. \tag{14.2.34}$$

This operation is related to the addition theorem of the hyperbolic tangent and
to the theory of special relativity.
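
The properties of the operation (14.2.34) used above can be checked numerically on the interval $(-1, 1)$. The sketch below is illustrative only; the function names and sample points are assumptions. It confirms the neutral element, associativity, the explicit solution of $x \circ t = u$ for $x$, the link with the hyperbolic tangent, and the fact that the inverse hyperbolic tangent converts $\circ$ into ordinary addition.

```python
import math

def compose(s, t):
    """The operation (14.2.34) on the open interval (-1, 1)."""
    return (s + t) / (1.0 + s * t)

def solve_for_x(u, t):
    """Solve x o t = u for x, i.e. p_u(t) in the notation of Theorem 14.2.2."""
    return (u - t) / (1.0 - u * t)

s, t, r = 0.3, -0.65, 0.9

neutral_ok = compose(0.0, t) == t
assoc_gap = abs(compose(compose(s, t), r) - compose(s, compose(t, r)))
tanh_gap = abs(math.tanh(0.4 + 1.1) - compose(math.tanh(0.4), math.tanh(1.1)))

u = compose(s, t)
inverse_gap = abs(solve_for_x(u, t) - s)

# atanh turns the composition into addition, so it solves f(s o t) = f(s) + f(t).
cauchy_gap = abs(math.atanh(compose(s, t)) - (math.atanh(s) + math.atanh(t)))
```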

The case where $s \circ t = s + t$ and $G(u, v)$ is a symmetric analytic function is
known as an addition theorem. About 100 years ago Weierstrass determined all
functions which have an algebraic addition theorem, i.e. where $G$ is an algebraic
symmetric function. It turned out that $f$ has to be an algebraic function of one of
the three arguments, $z$, $e^{az}$, $\wp(bz)$, where $\wp(z)$ is the elliptic function of Weierstrass.
This is under the assumption of complex variables. If $s$ and $t$ are restricted to real
values, the solutions are piecewise analytic and expressible by the same functions.

The case

$$f(s + t) = G[f(s), f(t)] \tag{14.2.35}$$

with $f$ mapping $C$ into a Banach algebra, in particular $\mathfrak{M}_n$, was examined by
N. Dunford and E. Hille in 1944. Here $G(u, v)$ is a symmetric analytic
function. Under fairly general assumptions, a continuous solution will possess
derivatives of all orders. The value of $f(0)$ is restricted to the roots of the
equation

$$G(a, a) = a; \tag{14.2.36}$$

$f'(0)$ can be chosen practically arbitrarily, but the values of the higher
derivatives are uniquely determined by $a$ and $f'(0)$.

The case $s \circ t = st$ is also of some interest. This leads to a so-called
multiplication theorem with

$$f(st) = G[f(s), f(t)]. \tag{14.2.37}$$

This type of multiplication theorem should not be confused with the type
considered in Section 12.5. The case where $f$ is a linear mapping $T$ of a commutative
algebra into itself has been subjected to a considerable amount of investigation.
Such a $T$ is known as a Bourlet operator after C. Bourlet (1897). According to
G. I. Targonski (1967), if the algebra possesses a unit element and has no
divisors of zero, there are essentially only three distinct types of Bourlet operators
with the corresponding functional equations:

$$T[uv] = B[Tu][Tv], \tag{14.2.38}$$
$$T[uv] = \tfrac{1}{2}[uTv + vTu], \tag{14.2.39}$$
$$T[uv] = uTv + vTu. \tag{14.2.40}$$

The last operator is known as a derivation.

EXERCISE 14.2
1. Verify that the continuous solutions of (14.2.2) to (14.2.4) have the form stated in
the text.
2. Construct non-measurable solutions of (14.2.2).
The next four problems are byproducts of discussions with J. H. B. Kemperman and
R. C. Metzler.
3. If $f$ is chosen as a non-measurable solution of (14.2.1) satisfying (14.2.15), show that
this implies that
$$f(sx) = t f(x),$$
where $s$ is any element of $Q(\rho)$ and $t = Z(s)$ is the corresponding element of $Q(\sigma)$.

4. Show that if (14.2.15) holds for an algebraic irrational number $\rho$, then $\sigma$ must also be
algebraic and $\rho$ and $\sigma$ satisfy the same irreducible algebraic equation. [Hint: Note
that there is a polynomial in $\rho$ which is a rational number and use the result of the
preceding problem.]
5. Carry through the construction with $\rho = \sqrt{2}$, $\sigma = -\sqrt{2}$.

6. The following is an example of a non-measurable quadratic generalized polynomial
in the sense of Section 8.4. Take a Hamel basis $\{x_\alpha\}$ for the reals and form


Q(x) = 2 24a pXa*pp x= Σ, aXe: 


Find $\delta Q(x, h)$ and $\delta^2 Q(x, h)$ and verify that the higher variations vanish identically.
7. With the aid of the result (14.2.30) for the vector case, find the general solution of
(14.2.25) for matrices.
8. Verify (14.2.21).
9. Verify (14.2.24).
10. Fill in missing details in the derivation of (14.2.30).

11. If $\mathfrak{X} = C[0, \infty]$ and for $s > 0$, $T(s)[f](t) = f(t + s)$, show that $\{T(s)\}$ is a semi-
group of linear bounded transformations and find $\|T(s)\|$.

12. Let $\mathfrak{B}$ stand for a B-algebra with unit element $e$ and take the mapping $f$ of $C$ into $\mathfrak{B}$
defined by $s \to f(s) = (e + p)^s$ where $p$ is a nilpotent of order $k > 1$. Show by
operational calculus or otherwise that

$$(e + p)^s = e + \binom{s}{1} p + \binom{s}{2} p^2 + \cdots + \binom{s}{k-1} p^{k-1}.$$

Show that $f(s) f(t) = f(s + t)$ for all $s$, $t$.

13. Find the addition theorem for (1 — z)?. 

14. If $G(u, v) = u(1 - v^2)^{1/2} + v(1 - u^2)^{1/2}$, find a function with the corresponding
addition theorem.

15. The function $\wp(z)$ is that solution of the differential equation

$$[w'(z)]^2 = 4[w(z)]^3 - g_2\, w(z) - g_3$$
which becomes infinite at $z = 0$. Here $g_2$ and $g_3$ are arbitrary constants, not both
zero. Use the method of Section 14.1 to determine the nature of the singularity at
$z = 0$ and find the principal part of the Laurent expansion (i.e. the terms which
become infinite at 0).

16. Find the continuous solutions of

$$f\Big(\frac{x + y}{1 + xy}\Big) = f(x) + f(y).$$

Verify that the conditions of Theorem 14.2.2 are satisfied for $|x| < 1$, $|y| < 1$.
[Hint: There is a transformation of the variables which reduces this equation to the
form (14.2.1).]

17. If the graph of a continuous solution of

$$f[\tfrac{1}{2}(x + y)] = \tfrac{1}{2}[f(x) + f(y)]$$
passes through two given points in the plane, show that the solution is unique and
determine it explicitly.

For the remaining problems, see Targonski (1967). As basic algebra take the algebra $\mathfrak{A}$
of polynomials over the complex field. Examples are given of Bourlet operators satisfying
one of the equations (14.2.38) to (14.2.40). The operators $T$ are defined in terms of a
fixed element $g$ of $\mathfrak{A}$.

18. Show that $T[f](x) = f[g(x)]$ satisfies (14.2.38) with $B = 1$. Such a $T$ is called a
substitution operator. What is the relation between $g$ and $T$?

19. There may be an element $f$ of $\mathfrak{A}$ such that $f[g(x)] = \lambda f(x)$. In other words, $T$ has a
characteristic value $\lambda$ with characteristic function $f$. The functional equation

$$f[g(x)] = \lambda f(x)$$
is known as Schröder’s equation after E. Schröder (1871). Show that if $g(x) = x^2$,
the corresponding Schröder equation can have no solution for $\lambda \neq 1$ and only
constant solutions for $\lambda = 1$ in the polynomial algebra $\mathfrak{A}$ but does have solutions in
a function algebra containing $\log x$. Which solutions?

20. Show that the multiplication operator $T[f](x) = g(x) f(x)$ satisfies (14.2.39) and
express $g$ in terms of $T$.

21. Show that $T[f](x) = g(x) f'(x)$ satisfies (14.2.40) and express $g$ in terms of $T$.
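
The three Bourlet equations (14.2.38) to (14.2.40) can be tested exactly on polynomials with integer coefficients. The following sketch is only a spot check on sample polynomials, not a proof; polynomials are stored as coefficient lists (constant term first), and the helper names, the fixed element `G`, and the samples `u`, `v` are all hypothetical choices.

```python
def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def padd(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def pderiv(p):
    return trim([k * p[k] for k in range(1, len(p))] or [0])

def pcompose(p, q):
    """p(q(x)) by Horner's rule."""
    out = [0]
    for a in reversed(p):
        out = padd(pmul(out, q), [a])
    return out

G = [1, 0, 1]                 # fixed element g(x) = 1 + x^2 (arbitrary choice)
u = [2, -1, 3]                # 2 - x + 3x^2
v = [0, 4, 0, 1]              # 4x + x^3

def T_sub(f):                 # substitution operator f -> f(g(x))
    return pcompose(f, G)

def T_mul(f):                 # multiplication operator f -> g f
    return pmul(G, f)

def T_der(f):                 # f -> g f'
    return pmul(G, pderiv(f))

uv = pmul(u, v)
sub_ok = T_sub(uv) == pmul(T_sub(u), T_sub(v))                        # (14.2.38), B = 1
mul_ok = pmul([2], T_mul(uv)) == padd(pmul(u, T_mul(v)), pmul(v, T_mul(u)))  # (14.2.39)
der_ok = T_der(uv) == padd(pmul(u, T_der(v)), pmul(v, T_der(u)))      # (14.2.40)
```

Equation (14.2.39) is checked after multiplying both sides by 2 so that the arithmetic stays in the integers.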


14.3 UNIQUENESS THEOREMS 


Around 1960 the most active centers of research in the theory of functional
equations were located in Hungary with J. Aczél and his pupils at the
University of Debrecen and the group led by M. Hosszú and E. Vincze at the
Technical University of Miskolc. To Vincze (1962) we owe a general method of
solution for certain classes of functional equations based on the implications of
linear independence. Hosszú will figure briefly later in this section. Here we shall
merely discuss one of Aczél’s many basic contributions, a uniqueness theorem
(1964) which is general, easy to state, and easy to prove. It deals with
transformations from $R^1 \times R^1$ to $R^1$ defined by a mapping $(x, y) \to F(x, y)$
where $F(x, y)$ lies strictly between $x$ and $y$ if $x \neq y$. Such a mapping $F$ is
called intern by Aczél. We recall that a mapping $u \to H(u)$ is said to be
injective (see Section 1.3) if $H(u_1) = H(u_2)$ implies $u_1 = u_2$. After these
preliminaries we can state and prove


Theorem 14.3.1. Let a mapping $F$ from $R^1 \times R^1$ to $R^1$ be defined and
continuous for $(x, y) \in (A, B) \times (A, B)$ as well as intern so that

$$x < y \quad \text{implies} \quad x < F(x, y) < y, \quad x < F(y, x) < y. \tag{14.3.1}$$

Let $H(u, v, x, y)$ be a function of four variables, injective either in $u$ or in $v$.
Then the functional equation

$$f[F(x, y)] = H[f(x), f(y), x, y] \tag{14.3.2}$$
with the initial conditions
$$f(a) = c, \quad f(b) = d, \quad A < a < b < B, \tag{14.3.3}$$

has at most one continuous solution.


Proof. Suppose there were two continuous solutions $f_1(x)$ and $f_2(x)$ which we
may assume to be defined in all of $(A, B)$. We want to show that the relations
$f_1(a) = f_2(a)$, $f_1(b) = f_2(b)$ imply $f_1(x) = f_2(x)$ for all $x$. This is proved in three
steps, one for each of the subintervals of $(A, B)$ corresponding to the partition
points $x = a$ and $x = b$.

I. The interval $[a, b]$. If there should exist a point $x_0$ with $a < x_0 < b$ and
$f_1(x_0) \neq f_2(x_0)$, then we determine two points $C$ and $D$, $a \le C < x_0 < D \le b$,
by the following considerations. Let $S_1$ be the set of points $x$ in the interval $[a, x_0]$
where $f_1(x) = f_2(x)$ and let $C$ be the supremum of $x$ for $x \in S_1$. There exists such
a $C$ and $a \le C$. Similarly, if $S_2$ is the set of points $x$ in $(x_0, b]$ where
$f_1(x) = f_2(x)$, let $D$ be the infimum of $x$ in $S_2$. Here $D \le b$. Further, by the
continuity of $f_1$ and $f_2$,

$$f_1(C) = f_2(C), \quad f_1(D) = f_2(D) \tag{14.3.4}$$
and
$$f_1(x) \neq f_2(x) \quad \text{if } C < x < D. \tag{14.3.5}$$
From (14.3.2) and (14.3.4) one obtains
$$f_1[F(C, D)] = H[f_1(C), f_1(D), C, D] = H[f_2(C), f_2(D), C, D] = f_2[F(C, D)]. \tag{14.3.6}$$

This contradicts (14.3.5) since $C < F(C, D) < D$, and thus proves the equality
of $f_1(x)$ and $f_2(x)$ in $[a, b]$.

II. The interval $(b, B)$. Denote by $E$ the supremum of the points $t$ in
$(b, B)$ for which $f_1(x) = f_2(x)$ for all $x$ in $[a, t]$. If $E = B$ we are through.
If $E < B$, note that there exists a sequence $\{t_n\}$ such that (i) $E < t_n < B$, $\forall n$,
(ii) $t_n \downarrow E$, (iii) $f_1(t_n) \neq f_2(t_n)$, and (iv) $a < F(t_n, a) < E$ for infinitely many values
of $n$. Only the last property requires some cogitation. If we should have
$F(t_n, a) \ge E$ for all large values of $n$, then

$$F(E, a) = \lim F(t_n, a) \ge E,$$
and this would contradict the intern property of $F$. Hence (iv) is valid. We have
then, for $t_n$ satisfying (iv),

$$f_1[F(t_n, a)] = H[f_1(t_n), f_1(a), t_n, a] \neq H[f_2(t_n), f_2(a), t_n, a] = f_2[F(t_n, a)]. \tag{14.3.7}$$

Here we have assumed that the injective property of $H$ holds with respect to the
first argument. If it should hold with respect to the second argument instead, it
suffices to permute $t_n$ and $a$ in the formulas. Here the inequality is obtained
since the first arguments $f_1(t_n)$ and $f_2(t_n)$ differ by virtue of (iii), while the three
other arguments are the same in both cases. On the other hand, from (iv) it follows
that

$$f_1[F(t_n, a)] = f_2[F(t_n, a)] \tag{14.3.8}$$

and this contradicts (14.3.7). Hence we must have $E = B$ and equality holds in
$(b, B)$.

III. The interval $(A, a)$. Use the same type of argument as under II. ∎


Aczél’s theorem asserts the existence of at most one continuous solution of 
(14.3.2) satisfying a given two-point condition. It does not affirm the existence 
of such a solution. For the existence of a solution it would seem to be 
necessary to assume continuity of H in its four arguments together with rather 
strong supplementary conditions. If a priori information concerning the existence 
of continuous solutions is available, such a solution may be computed to any 
required degree of accuracy by using the intern property of F over and over 
again. Note that the equation (14.3.2) is such that if the values of a solution are
known at two points $x_1$ and $x_2$, $x_1 < x_2$, then the value is also known at an
intermediary point since

$$f[F(x_1, x_2)] = H[f(x_1), f(x_2), x_1, x_2]. \tag{14.3.9}$$

The points in $[x_1, x_2]$, which may be reached by repeated application of this
device, are apt to be dense in the interval and hence determine a continuous
solution everywhere.
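
The computational device just described can be sketched in a few lines. The example below is an assumption-laden illustration: it takes the arithmetic mean for $F$ and the averaging map for $H$ (a hypothetical instance for which the reachable points are the dyadic points and hence dense), presupposes that a continuous solution exists, and approximates its value at a prescribed point by repeatedly halving the interval on which the two end values are known.

```python
def F(x, y):
    return 0.5 * (x + y)          # intern: lies strictly between x and y

def H(u, v, x, y):
    return 0.5 * (u + v)          # injective in u (and in v)

def approximate(x, a, c, b, d, steps=50):
    """Approximate f(x) for a <= x <= b from f(a) = c and f(b) = d, using only
    the relation f[F(x1, x2)] = H[f(x1), f(x2), x1, x2] and continuity of f."""
    x1, f1, x2, f2 = a, c, b, d
    for _ in range(steps):
        xm = F(x1, x2)                 # new point where the value becomes known
        fm = H(f1, f2, x1, x2)         # its value, by (14.3.9)
        if xm <= x:
            x1, f1 = xm, fm
        else:
            x2, f2 = xm, fm
    return f1

# Two-point condition f(0) = 1, f(2) = 7; the continuous solution is 3s + 1.
estimate = approximate(0.3, 0.0, 1.0, 2.0, 7.0)
```

For other intern mappings $F$ the same loop applies, but whether the reachable points are dense has to be checked case by case.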

Illustrations are furnished by Jensen’s equation or the equation of the
arithmetic means

$$f[\tfrac{1}{2}(s + t)] = \tfrac{1}{2}[f(s) + f(t)] \tag{14.3.10}$$

and by the equation of the geometric means

$$f(\sqrt{st}) = \tfrac{1}{2}[f(s) + f(t)], \quad 0 < s, \ 0 < t, \tag{14.3.11}$$
satisfied by
$$s \to f(s) = as + b, \tag{14.3.12}$$

$$s \to f(s) = a \log s + b, \tag{14.3.13}$$

respectively. Here $a$ and $b$ are arbitrary constants. These are the continuous
solutions. Judging by Jensen’s equation, there should be no lack of non-measurable
solutions of equations of the form (14.3.2). For if $f$ is any non-
measurable solution of (14.2.1), it is also a solution of (14.3.10).

These special equations for mean values may have served as the starting point 
for Aczél’s discussion of two-point conditions and their importance for functional 
equations. The reader has undoubtedly noticed that (14.3.12) is the equation of a 
straight line, and there is one and only one straight line through two given 
points. 

Aczél also considered other examples from the theory of mean values. The
following functional equation stems from information theory:

$$f\Big[h\Big(\frac{x g(x) + y g(y)}{x + y}\Big)\Big] = \frac{x f(x) + y f(y)}{x + y}. \tag{14.3.14}$$

Here $g$ is a given strictly monotone continuous function and $h$ is its inverse. The
variables are restricted to

$$\{(x, y);\ 0 < x,\ 0 < y,\ x + y < 1\}.$$
In the notation of Theorem 14.3.1 we have

$$F(x, y) = h\Big[\frac{x g(x) + y g(y)}{x + y}\Big], \qquad H(u, v, x, y) = \frac{xu + yv}{x + y}.$$

Here $H$ is strictly increasing both in $u$ and in $v$ and hence injective. To fix the
ideas, suppose that $g$ is strictly increasing and $0 < x < y$. Then

$$g(x) < \frac{x g(x) + y g(y)}{x + y} < g(y).$$
Since $h$ is also strictly increasing,
$$x = h[g(x)] < h\Big[\frac{x g(x) + y g(y)}{x + y}\Big] < h[g(y)] = y.$$

Since continuity of $F$ is obvious, all conditions of the theorem are satisfied, so a
continuous solution satisfying a two-point condition is unique.
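
The verification above can be spot-checked numerically. The sketch below assumes the hypothetical choice $g(x) = x^2$ on $(0, 1)$, with inverse $h(t) = \sqrt{t}$; it tests the intern property of $F$ on a few sample points and evaluates the residual of (14.3.14) for a function of the form $\alpha + \beta g(x)$, which satisfies the equation because the equation is linear in $f$ and is satisfied by both the constant 1 and $g$. All names and sample values are illustrative assumptions.

```python
import math

def g(x):
    return x * x              # hypothetical choice, strictly increasing on (0, 1)

def h(t):
    return math.sqrt(t)       # inverse of g on (0, 1)

def F(x, y):
    return h((x * g(x) + y * g(y)) / (x + y))

def H(u, v, x, y):
    return (x * u + y * v) / (x + y)

def residual(f, x, y):
    """Left side minus right side of (14.3.14)."""
    return f(F(x, y)) - H(f(x), f(y), x, y)

samples = [(0.1, 0.2), (0.05, 0.9), (0.3, 0.45), (0.6, 0.25)]

# F(x, y) should lie strictly between x and y.
intern = all(min(x, y) < F(x, y) < max(x, y) for x, y in samples)

def f_affine(x, alpha=2.0, beta=-3.0):
    return alpha + beta * g(x)

worst = max(abs(residual(f_affine, x, y)) for x, y in samples)
```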

Now solutions can be read off by inspection. It is clear that f(x) = 1 is a 
solution and direct substitution shows that f(x) = g(x) is also a solution. Hence 
the general continuous solution is 


$$f(x) = \alpha + \beta g(x). \tag{14.3.15}$$

Note that this solution involves two arbitrary constants the values of which can
be determined by two endpoint conditions.

The condition that $F$ be intern is quite restrictive and would, for instance,
exclude the consideration of an addition theorem. This defect has been
remedied, at least in part, through later work by Aczél and Hosszú (1965). They
have shown that the equation

$$f[F(x, y)] = H[f(x), f(y)] \tag{14.3.16}$$

can have at most one continuous solution satisfying a two-point condition
(14.3.3), if $F$ is continuous and if $F$ and $H$ are both strictly increasing (both
strictly decreasing) with respect to both variables involved. They also found that if
$F(x, x) \neq x$ and if $a \neq F(a, a)$, then there exists at most one continuous solution
satisfying the one-point condition $f(a) = c$, instead of the two-point condition
(14.3.3). This is interesting because it explains, for instance, the different behavior
of the very similar Jensen and Cauchy equations, the general solution of the first
being a two-parameter, that of the second a one-parameter, manifold of
functions.

In 1969 Aczél’s theorem was extended to topological vector spaces by 
C. T. Ng. We shall give a brief account of Ng’s work, but to simplify the 
exposition we restrict ourselves to Euclidean spaces only. 

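The first notion to be generalized is that of an intern mapping. Let $R^m$ be
the Euclidean space of $m$ dimensions over the reals. If $\mathbf{x}, \mathbf{y} \in R^m$, $\mathbf{x} \neq \mathbf{y}$, denote
the line through $\mathbf{x}$ and $\mathbf{y}$ by

$$L\langle \mathbf{x}, \mathbf{y} \rangle = \{\mathbf{y} + t(\mathbf{x} - \mathbf{y});\ t \in R^1\}, \tag{14.3.17}$$
and the open line segment joining $\mathbf{x}$ and $\mathbf{y}$ by
$$L(\mathbf{x}, \mathbf{y}) = \{\mathbf{y} + t(\mathbf{x} - \mathbf{y});\ 0 < t < 1\}. \tag{14.3.18}$$

Let $E$ be a closed convex subset of $R^m$. A mapping $\mathbf{F}$ from $E \times E$ to $E$ is said to
be intern if, whenever $\mathbf{x}, \mathbf{y} \in E$ with $\mathbf{x} \neq \mathbf{y}$,

$$\mathbf{F}(\mathbf{x}, \mathbf{y}) \in L(\mathbf{x}, \mathbf{y}). \tag{14.3.19}$$
The injective mapping of $L\langle \mathbf{x}, \mathbf{y} \rangle$ into $R^1$ defined by
$$\mathbf{y} + t(\mathbf{x} - \mathbf{y}) \to t \tag{14.3.20}$$

is a perspectivity $p_{\mathbf{x}}^{\mathbf{y}}$, a fortiori, a homeomorphism. Note, further, that if
$\mathbf{f}_1$ and $\mathbf{f}_2$ are continuous mappings from $R^m$ into $R^n$, then the set

$$S = \{\mathbf{x};\ \mathbf{x} \in R^m,\ \mathbf{f}_1(\mathbf{x}) = \mathbf{f}_2(\mathbf{x})\} \tag{14.3.21}$$

is closed in $R^m$.
For the proof of Ng’s main theorem two preliminary lemmas are needed
which are of independent interest.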


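Lemma 14.3.1. With $E$ and $\mathbf{F}$ as above, suppose that $\mathbf{f}_1$ and $\mathbf{f}_2$ are
continuous mappings of $E$ into $R^n$ satisfying

$$\mathbf{f}[\mathbf{F}(\mathbf{x}, \mathbf{y})] = \mathbf{H}[\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{y}), \mathbf{x}, \mathbf{y}], \tag{14.3.22}$$
where $\mathbf{H}$ is a mapping from $R^n \times R^n \times E \times E$ into $R^n$. Then the set
$$S = \{\mathbf{x};\ \mathbf{x} \in E,\ \mathbf{f}_1(\mathbf{x}) = \mathbf{f}_2(\mathbf{x})\} \tag{14.3.23}$$

is necessarily closed and convex.

Proof. Since $E$ is closed in $R^m$ and $S$ is closed in $E$, $S$ is closed in $R^m$. For
any two distinct points $\mathbf{x}$ and $\mathbf{y}$ of $S$, consider the line $L\langle \mathbf{x}, \mathbf{y} \rangle$ and the mapping
$p_{\mathbf{x}}^{\mathbf{y}}$ of the line into $R^1$. In particular, we fix the attention on the set
$L(\mathbf{x}, \mathbf{y}) \ominus S$ which is mapped into a subset of $0 < t < 1$. Since $S$ is closed, this
subset is open in $(0, 1)$, and if not void is the countable union of open disjoint
intervals. Consider the latter alternative and let $t_1$ and $t_2$ be the endpoints of one
of these intervals and set

$$\mathbf{x}_1 = \mathbf{y} + t_1(\mathbf{x} - \mathbf{y}), \qquad \mathbf{x}_2 = \mathbf{y} + t_2(\mathbf{x} - \mathbf{y}).$$

These points define a line segment $L(\mathbf{x}_1, \mathbf{x}_2) \subset L(\mathbf{x}, \mathbf{y})$, no points of which
belong to $S$, while the endpoints do belong. Now $\mathbf{F}(\mathbf{x}_1, \mathbf{x}_2)$ is a well-defined point
of $L(\mathbf{x}_1, \mathbf{x}_2)$ by the intern property of $\mathbf{F}$, and it belongs to $E$, the set of definition
of $\mathbf{f}_1$ and $\mathbf{f}_2$. Now the functional equation (14.3.22) gives

$$\mathbf{f}_1[\mathbf{F}(\mathbf{x}_1, \mathbf{x}_2)] = \mathbf{H}[\mathbf{f}_1(\mathbf{x}_1), \mathbf{f}_1(\mathbf{x}_2), \mathbf{x}_1, \mathbf{x}_2]
= \mathbf{H}[\mathbf{f}_2(\mathbf{x}_1), \mathbf{f}_2(\mathbf{x}_2), \mathbf{x}_1, \mathbf{x}_2] = \mathbf{f}_2[\mathbf{F}(\mathbf{x}_1, \mathbf{x}_2)].$$

Thus $\mathbf{F}(\mathbf{x}_1, \mathbf{x}_2)$ belongs to both $S$ and $L(\mathbf{x}_1, \mathbf{x}_2)$, contrary to the assumption that
$L(\mathbf{x}_1, \mathbf{x}_2) \cap S$ is void. This contradiction shows that $L(\mathbf{x}, \mathbf{y}) \ominus S$ is void, that
is, $L(\mathbf{x}, \mathbf{y}) \subset S$. Now $\mathbf{x}$ and $\mathbf{y}$ being arbitrary points of $S$, it follows that $S$ is
convex as well as closed. ∎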


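Lemma 14.3.2. Let the set $E$ and the mapping $\mathbf{F}$ be as above with
$\mathbf{F}$ continuous in both variables. Let $\mathbf{f}_1$, $\mathbf{f}_2$ be mappings of $E$ into $R^n$,
again satisfying (14.3.22), though not necessarily continuous. Let
$\mathbf{H}$: $R^n \times R^n \times E \times E \to R^n$ be injective either in the first argument or in the
second. If $\mathbf{f}_1$ and $\mathbf{f}_2$ are identical in some $E$-neighborhood of a point $\mathbf{a} \in E$, then
$\mathbf{f}_1$ and $\mathbf{f}_2$ are identical in their entire domain of definition $E$.

Proof. To fix the ideas, suppose that $\mathbf{H}$ is injective with respect to the first
variable. Let $\mathbf{b} \in E$, $\mathbf{b} \neq \mathbf{a}$. Define
$$\mathbf{F}^0(\mathbf{x}, \mathbf{a}) = \mathbf{x} \tag{14.3.24}$$
and
$$\mathbf{F}^{p+1}(\mathbf{x}, \mathbf{a}) = \mathbf{F}[\mathbf{F}^p(\mathbf{x}, \mathbf{a}), \mathbf{a}] \tag{14.3.25}$$

recursively for all $p$ and all $\mathbf{x} \in E$. Take, in particular, $\mathbf{x} = \mathbf{b}$. Then each
$\mathbf{F}^p(\mathbf{b}, \mathbf{a})$, $p \neq 0$, belongs to $L(\mathbf{b}, \mathbf{a})$ by the intern property of $\mathbf{F}$. Hence there is a
number $t_p$, $0 < t_p < 1$, such that

$$\mathbf{F}^p(\mathbf{b}, \mathbf{a}) = \mathbf{a} + t_p(\mathbf{b} - \mathbf{a}) \tag{14.3.26}$$
and, since $\mathbf{F}$ is intern, the sequence $\{t_p\}$ is strictly decreasing to a limit, say
$t_0 \ge 0$. It is claimed that $t_0 = 0$ and $\lim_{p \to \infty} \mathbf{F}^p(\mathbf{b}, \mathbf{a})$, which clearly exists, equals $\mathbf{a}$.
For $\mathbf{F}$ is continuous in its first argument and if

$$\lim_{p \to \infty} \mathbf{F}^p(\mathbf{b}, \mathbf{a}) = \mathbf{a} + t_0(\mathbf{b} - \mathbf{a}) \neq \mathbf{a},$$

then
$$\lim_{p \to \infty} \mathbf{F}^{p+1}(\mathbf{b}, \mathbf{a}) = \lim_{p \to \infty} \mathbf{F}[\mathbf{F}^p(\mathbf{b}, \mathbf{a}), \mathbf{a}] = \mathbf{F}[\lim_{p \to \infty} \mathbf{F}^p(\mathbf{b}, \mathbf{a}), \mathbf{a}]
= \mathbf{F}[\mathbf{a} + t_0(\mathbf{b} - \mathbf{a}), \mathbf{a}] \neq \mathbf{a} + t_0(\mathbf{b} - \mathbf{a}).$$

This contradiction shows that $t_0 = 0$ and $\lim \mathbf{F}^p(\mathbf{b}, \mathbf{a}) = \mathbf{a}$.
Let $V$ denote the $E$-neighborhood of $\mathbf{x} = \mathbf{a}$ where $\mathbf{f}_1$ and $\mathbf{f}_2$ are identical. All
points $\mathbf{F}^p(\mathbf{b}, \mathbf{a})$ are in $E$ and hence ultimately in $V$. Suppose $k$ is so large that

$$\mathbf{F}^k(\mathbf{b}, \mathbf{a}) \in V.$$
Then
$$\mathbf{f}_1[\mathbf{F}^k(\mathbf{b}, \mathbf{a})] = \mathbf{f}_2[\mathbf{F}^k(\mathbf{b}, \mathbf{a})].$$

It is desired to show that
$$\mathbf{f}_1[\mathbf{F}^p(\mathbf{b}, \mathbf{a})] = \mathbf{f}_2[\mathbf{F}^p(\mathbf{b}, \mathbf{a})], \quad p = 0, 1, 2, \ldots, k - 1. \tag{14.3.27}$$

Here we use retrogressive induction on $p$ passing from $p$ to $p - 1$. The
functional equation gives

$$\mathbf{H}\{\mathbf{f}_1[\mathbf{F}^{k-1}(\mathbf{b}, \mathbf{a})], \mathbf{f}_1(\mathbf{a}), \mathbf{F}^{k-1}(\mathbf{b}, \mathbf{a}), \mathbf{a}\} = \mathbf{f}_1[\mathbf{F}^k(\mathbf{b}, \mathbf{a})]
= \mathbf{f}_2[\mathbf{F}^k(\mathbf{b}, \mathbf{a})] = \mathbf{H}\{\mathbf{f}_2[\mathbf{F}^{k-1}(\mathbf{b}, \mathbf{a})], \mathbf{f}_2(\mathbf{a}), \mathbf{F}^{k-1}(\mathbf{b}, \mathbf{a}), \mathbf{a}\}.$$
Since $\mathbf{H}$ is injective with respect to the first argument, and $\mathbf{f}_1(\mathbf{a}) = \mathbf{f}_2(\mathbf{a})$, this gives
$$\mathbf{f}_1[\mathbf{F}^{k-1}(\mathbf{b}, \mathbf{a})] = \mathbf{f}_2[\mathbf{F}^{k-1}(\mathbf{b}, \mathbf{a})],$$

so that (14.3.27) holds for $p = k - 1$. We can then proceed recursively and see
that (14.3.27) holds for all indicated values of $p$. In particular, for $p = 0$

$$\mathbf{f}_1(\mathbf{b}) = \mathbf{f}_2(\mathbf{b}).$$

Since $\mathbf{b}$ is arbitrary, the equality holds everywhere in $E$. ∎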
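
The convergence of the iterates $\mathbf{F}^p(\mathbf{b}, \mathbf{a})$ to $\mathbf{a}$, on which the proof rests, is easy to observe for a concrete intern mapping. The sketch below is an illustration only: the mapping `F` on $R^2$, with a weight depending continuously on the two points, is a hypothetical example and not taken from Ng’s work.

```python
def F(x, y):
    """A hypothetical intern mapping on R^2: a point of the open segment
    joining x and y, with a weight t that depends continuously on (x, y)."""
    d2 = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    t = 1.0 / (2.0 + d2)                      # 0 < t <= 1/2
    return tuple(yi + t * (xi - yi) for xi, yi in zip(x, y))

def iterates(b, a, count):
    """Return [F^0(b, a), F^1(b, a), ..., F^count(b, a)], see (14.3.24)-(14.3.25)."""
    points = [b]
    for _ in range(count):
        points.append(F(points[-1], a))
    return points

a = (1.0, -2.0)
b = (4.0, 2.0)
pts = iterates(b, a, 40)

# Distance of each iterate from a; it is |b - a| times t_p of (14.3.26).
dist = [sum((p - q) ** 2 for p, q in zip(pt, a)) ** 0.5 for pt in pts]
```

The distances decrease strictly and tend to zero, so the iterates eventually enter any neighborhood $V$ of $\mathbf{a}$, which is the step that lets the retrogressive induction start.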
Combining these two lemmas, we get Ng’s main theorem. 


Theorem 14.3.2. Let $E$ be a closed convex subset of $R^m$ and $\mathbf{F}$ an intern
mapping from $E \times E$ to $E$, continuous in both variables. Let $\mathbf{H}$: $(\mathbf{u}, \mathbf{v}, \mathbf{x}, \mathbf{y}) \to
\mathbf{H}(\mathbf{u}, \mathbf{v}, \mathbf{x}, \mathbf{y})$ be a mapping from $R^n \times R^n \times E \times E$ into $R^n$, injective with
respect to either $\mathbf{u}$ or $\mathbf{v}$. Consider the functional equation

$$\mathbf{f}[\mathbf{F}(\mathbf{x}, \mathbf{y})] = \mathbf{H}[\mathbf{f}(\mathbf{x}), \mathbf{f}(\mathbf{y}), \mathbf{x}, \mathbf{y}]. \tag{14.3.28}$$

If $\mathbf{f}_1$ and $\mathbf{f}_2$ are two continuous mappings from $E$ to $R^n$ satisfying this equation
and if it be known that $\mathbf{f}_1$ and $\mathbf{f}_2$ are identical on some subset $D$ of $E$ whose
closed convex hull $C[D]$ has a non-void interior as a subset of $E$, then $\mathbf{f}_1$ and $\mathbf{f}_2$
are identical on all of $E$.

In particular, we have the following

Corollary. If $D$ contains $m + 1$ points $\mathbf{a}_j$ such that the vectors $\mathbf{a}_2 - \mathbf{a}_1$,
$\mathbf{a}_3 - \mathbf{a}_1, \ldots, \mathbf{a}_{m+1} - \mathbf{a}_1$ are linearly independent, then there exists at most one
continuous solution of (14.3.28) satisfying $m + 1$ initial conditions
$$\mathbf{f}(\mathbf{a}_j) = \mathbf{b}_j, \quad j = 1, 2, \ldots, m + 1. \tag{14.3.29}$$

For $m = 1$ this becomes Theorem 14.3.1 for any closed subinterval of $(A, B)$.

Aczél’s definition of an intern mapping given by (14.3.1) is the natural one 
on the line. On the other hand, Ng’s extension to higher dimensions and 
topological vector spaces is only one of several possibilities. J. B. Miller 
(personal communication) has proved an extension of Aczél’s theorem to R? 
based on partial ordering where x < y, x # y, implies x < F(x,y)<y. As 
reported by Aczél and Ng, a general uniqueness theorem has been found by Ng 
for ‘“‘cell-intern’’ mappings in Κ΄. 

One of the earliest instances of a two-point uniqueness theorem is a result due 
to Picard (1890). It occurs in the memoir where he brought forth the method of 
successive approximations. It reads: 


Theorem 14.3.3. Let $(x, y) \to F(x, y)$ be continuous in a rectangle $|x| < A$,
$|y| < B$, and strictly increasing as a function of $y$ for each fixed $x$. Then the
differential equation

$$y'' = F(x, y) \tag{14.3.30}$$
can have at most one solution satisfying conditions of the form

$$y(a) = c, \quad y(b) = d, \quad -A < a < b < A, \quad -B < c,\ d < B. \tag{14.3.31}$$

Proof. Suppose there were two solutions $f_1$ and $f_2$ with these properties.
Consider a maximal subinterval $(a_1, b_1)$ of $(a, b)$ in which

$$g(x) = f_2(x) - f_1(x)$$

keeps a constant sign, say $g(x) > 0$. Here $a \le a_1 < b_1 \le b$ and
$$g(a_1) = g(b_1) = 0$$

since the subinterval is maximal. Thus in $(a_1, b_1)$

$$g''(x) = f_2''(x) - f_1''(x) = F[x, f_2(x)] - F[x, f_1(x)] > 0,$$

since $F(x, y)$ is an increasing function of $y$. This means that $g$ is convex and $g'$
is increasing. Since $g(x) > 0$ for $a_1 < x < b_1$ and $g(a_1) = 0$ we must have
$g'(a_1) \ge 0$, so $g' > 0$ in $(a_1, b_1)$ and this forces $g(b_1)$ to be positive, which is a
contradiction. It follows that $f_1(x) = f_2(x)$ in $(a_1, b_1)$. If $(a_1, b_1)$ is a proper
subinterval of $(a, b)$, the same type of argument applies to any other subinterval
in which $g$ keeps a constant sign and hence ultimately to all of $(a, b)$. Thus
there is at most one solution of (14.3.30) that can satisfy the two-point condition. ∎
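
A simple instance may help to fix the ideas. The sketch below assumes the hypothetical right-hand side $F(x, y) = y$, which is strictly increasing in $y$; for $y'' = y$ the solution through two given points can be written with hyperbolic sines, and the code checks the two end conditions and, by a central second difference, the differential equation. It also records why monotonicity matters: for $y'' = -y$ every multiple of $\sin x$ vanishes at both $0$ and $\pi$. All names and numerical values are illustrative.

```python
import math

def solution(x, a, c, b, d):
    """The solution of y'' = y with y(a) = c, y(b) = d."""
    return (c * math.sinh(b - x) + d * math.sinh(x - a)) / math.sinh(b - a)

a, c, b, d = -0.5, 1.0, 0.8, -2.0

def y(x):
    return solution(x, a, c, b, d)

end_gap = max(abs(y(a) - c), abs(y(b) - d))

def second_difference(func, x, step=1e-4):
    """Central-difference approximation of the second derivative."""
    return (func(x + step) - 2.0 * func(x) + func(x - step)) / step ** 2

ode_gap = max(abs(second_difference(y, x) - y(x)) for x in (-0.3, 0.0, 0.4))

# For y'' = -y the right-hand side decreases in y and uniqueness can fail:
# k*sin(x) satisfies y(0) = y(pi) = 0 for every k.
family_at_pi = [k * math.sin(math.pi) for k in (1.0, 2.0, -3.0)]
```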


EXERCISE 14.3 
1. Verify (14.3.13). Why the names attached to equations (14.3.10) and (14.3.11)? 


2. Let p and q be arbitrary fixed positive numbers, g a given continuous and strictly 
monotone function, ἢ its inverse. Solve the equation 


Pf(x) + f(y) 


F(x, y)] = 
SLF(, y)] mee 


F(x, y) = ἢ (ΞΞ + 99 


pt+q 
Verify that the conditions of Theorem 14.3.1 are satisfied so that there is a unique 


continuous solution satisfying a given two-point condition. This equation occurs in 
the theory of averages. 


3. If the m + 1 vectorsa, — a,,a3 — a4,..., 8,4 1 — a, are linearly independent in R”, 
show that so are a, — a,, a3 — 82, ..., A+, — ἃ,,» and vice versa. Geometrical 
interpretation? 


4. The equation of the arithmetic mean for complex variables 


f[½(z_1 + z_2)] = ½[f(z_1) + f(z_2)]
has the obvious solution z → f(z) = α + βz where α and β are arbitrary complex
numbers. But there are also continuous solutions analogous to (14.2.17). Show that
x + iy → α + βx + γy is a solution for arbitrary α, β, γ and that the solution is
uniquely determined by a three-point condition f(a_j) = b_j, j = 1, 2, 3, provided the
triangle formed by the three points {a_j} in the complex plane is non-degenerate.


5. The equation of the arithmetic mean in R^3


f[½(x + y)] = ½[f(x) + f(y)]
with x, y and the range of f in R^3 has continuous solutions which are uniquely
determined by a four-point condition involving a non-degenerate tetrahedron. What
is the form of the solution?


6. Apply Ng’s theorem to the equation (13.2.25). 
7. Show that the only solutions of 


f(x + y) = [f(x) + f(y)]^2
are two specific constants. 
Find continuous solutions of the following functional equations: 
8. f(x + y) = f(x) g(y), g given continuous, g(0) = 1.
9. [f(x)]^2 = f(x + y) f(x − y).
10. f(x + y) + f(x − y) = 2 f(x) f(y). (J. d'Alembert, 1769.)
11. |f(x + iy)|^2 = |f(x)|^2 + |f(iy)|^2.
12. A classical method of solving functional equations in two variables is to assume
differentiability of the functions involved, differentiating with respect to one of the
variables, perhaps several times. If now one of the variables is given a special value,
the resulting differential equation may be solvable. Apply this method to the first
Cauchy equation and to Nos. 8, 9 and 10 above.


15 MEAN VALUES†


Mean values such as the arithmetic, geometric, and the power means have 
figured in many places of this treatise. In the present chapter we shall discuss 
in some detail a class of mean values referred to as A-averages which includes 
as special cases those just mentioned. Our point of departure will be a set of 
postulates given, independently of each other, by A. N. Kolmogorov and 
M. Nagumo in 1930, and, from a different angle, by B. de Finetti in 1931. 

The A-averages lead to functional equations of the “intern” type. A non-constant
solution of one of these equations is either unbounded in every interval
or is continuous as well as strictly monotone and defines an A-average.

This class of mean values has numerous important applications. We shall
consider some geometric extremal problems and study in some detail two
classes of set functions: transfinite diameters and Čebyšev constants. Both were
originally defined for sets in the complex plane and geometric means, but the
extensions to arbitrary complete metric spaces and A-averages do not lack interest.
There are also important connections with generalized potential theory. We shall
encounter many connections between the subject-matter of this chapter and those
of Chapters 13 and 14.

There are eight sections: The postulates; Associated functional equations;
Remarks on summability; Some geometric extremal problems; The transfinite
A-diameter; The Čebyšev constants; Some examples; and Potential theories.


15.1 THE POSTULATES 


We shall consider an average subject to the following conditions: 


(A_1) Let (a, b) be a given real interval. For each natural number n and for each
set of n numbers x_1, x_2, ..., x_n belonging to (a, b) there is a number
A(x_1, x_2, ..., x_n) in (a, b) called the A-average of these numbers.


(A_2) A(x_1, x_2, ..., x_n) is a continuous symmetric function of its arguments,
strictly increasing in each of them.


(A_3) A(x, x, ..., x) = x.


(A_4) For each n and each k < n let y = A(x_1, x_2, ..., x_k); then
A(x_1, x_2, ..., x_k, x_{k+1}, ..., x_n) = A(y, ..., y, x_{k+1}, ..., x_n), where y is repeated k times.


† Comments by J. Aczél and C. T. Ng have been helpful in the editing of this chapter.

There are various immediate consequences of the postulates which will be
stated as lemmas. 


Lemma 15.1.1. Unless all the x's are equal,
min x_j < A(x_1, x_2, ..., x_n) < max x_j. (15.1.1)
Proof. Use the strictly increasing character of A together with (A_3). ∎


Lemma 15.1.2. The average of k sets x_1, x_2, ..., x_n is the same as the average
of one set:


A(x_1, ..., x_n, x_1, ..., x_n, ..., x_1, ..., x_n) = A(x_1, ..., x_n). (15.1.2)


Proof. Let y denote the right member of the proposed equality. Using (A_4)
repeatedly, we can replace each of the k aggregates x_1, x_2, ..., x_n by y repeated
n times. The result is the average of y repeated kn times and this is y
by (A_3). Hence the two sides of (15.1.2) are equal. ∎


This gives a convenient method of extending and contracting averages. The 
same device gives the principle of repeated averages. 


Lemma 15.1.3. From a set E of n numbers in (a, b), say x_1, x_2, ..., x_n, select
k numbers, 1 < k < n, and form their average. The average of all the
averages for k fixed obtainable in this way equals A(x_1, x_2, ..., x_n).


Proof. The basis for this fact is the preceding lemma together with the
identity between binomial coefficients


n \binom{n−1}{k−1} = k \binom{n}{k}. (15.1.3)


We extend the given set E so as to obtain \binom{n−1}{k−1} copies of each x_j.
The extended set E* has a number of elements equal to the left side of
(15.1.3). This identity shows that we can separate the elements of E* into
\binom{n}{k} distinct subsets S_ν, each being a selection of k elements of E. Averaging
over E or over E* gives the same result by Lemma 15.1.2. On the other hand,
in the average over E* we may replace the elements of a subset S_ν by their
average y_ν^{(k)} = A(S_ν) repeated k times. Thus the average over E* equals the
average of the \binom{n}{k} averages A(S_ν) each repeated k times. By Lemma 15.1.2
this reduces to the average of the averages as stated. ∎
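The principle of repeated averages is easy to test numerically. The following is only a sketch under assumptions of our own choosing: the geometric mean stands in for a general A-average, the set E is an arbitrary example, and the helper names are hypothetical.

```python
from itertools import combinations
from math import exp, log, isclose

def geometric_mean(values):
    """Geometric mean, used here as one concrete A-average on (0, inf)."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))

def average_of_subset_averages(values, k, mean=geometric_mean):
    """Average, under `mean`, of the averages of all k-element selections."""
    return mean(mean(subset) for subset in combinations(values, k))

# Example set E (an assumption); its geometric mean is 2**(10/5) = 4.
E = [1.0, 2.0, 4.0, 8.0, 16.0]
full = geometric_mean(E)

# Lemma 15.1.3: for each fixed k the average of the averages equals A(E).
for k in range(2, len(E)):
    assert isclose(average_of_subset_averages(E, k), full, rel_tol=1e-12)
```

Any other mean satisfying the postulates could be passed as `mean`; the geometric mean is used only because its values are easy to check by hand.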
Corollary. With the same notation, for each k < n,
min_ν y_ν^{(k)} < A(x_1, x_2, ..., x_n) < max_ν y_ν^{(k)} (15.1.4)
unless all the x's are equal.


The last remark follows from the fact that the y's are equal iff the x's are equal.
This means that the averaging process is oscillation reducing in the following
sense. Let S be an arbitrary set of numbers in the interval (a, b). Select n
distinct numbers from S and form their A-average. The set of all such
averages involving n numbers is a set S_n. For each n set


a_n = inf S_n, b_n = sup S_n. (15.1.5)
Then for all n > 1
a_{n−1} ≤ a_n ≤ b_n ≤ b_{n−1}, (15.1.6)
so that
b_n − a_n ≤ b_{n−1} − a_{n−1}. (15.1.7)


This follows from the corollary. 
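For a finite set S the numbers a_n and b_n can be computed by brute force, which gives a quick check of (15.1.6). The sketch below assumes the geometric mean as the A-average and an arbitrary five-point set; both choices, and the function names, are illustrative only.

```python
from itertools import combinations
from math import exp, log

def geometric_mean(values):
    """Geometric mean, used here as one concrete A-average on (0, inf)."""
    values = list(values)
    return exp(sum(log(v) for v in values) / len(values))

def oscillation_bounds(S, n, mean=geometric_mean):
    """Return (a_n, b_n): least and greatest average of n distinct elements of S."""
    averages = [mean(c) for c in combinations(S, n)]
    return min(averages), max(averages)

S = [0.5, 1.0, 3.0, 7.0, 20.0]   # example set (an assumption)
bounds = [oscillation_bounds(S, n) for n in range(1, len(S) + 1)]

# (15.1.6): a_{n-1} <= a_n <= b_n <= b_{n-1} for every n > 1.
for (a_prev, b_prev), (a_n, b_n) in zip(bounds, bounds[1:]):
    assert a_prev <= a_n <= b_n <= b_prev
```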


EXERCISE 15.1


1. Verify that M_α(x) is an A-average.
2. Same question for the geometric mean with
A(x_1, x_2, ..., x_n) = (x_1 x_2 ... x_n)^{1/n}.
3. Same question for the harmonic mean with
A(x_1, x_2, ..., x_n) = {(1/n)[1/x_1 + 1/x_2 + ... + 1/x_n]}^{−1}.


4. [Aczél] Let m be a positive odd integer and take (a, b) = (−∞, ∞). Define a
mean value by
n[A(x_1, ..., x_n)]^m = x_1^m + x_2^m + ... + x_n^m.
Verify that this is an A-average.
5. [Aczél] Take (a, b) = (−½π, ½π) and define


A(x_1, ..., x_n) = arcsin{(1/n)[sin x_1 + ... + sin x_n]},


where the arc sine has its principal value between −½π and ½π. Verify that this is an
A-average.


6. Give a complete proof of Lemma 15.1.1. 
7. Prove the corollary of Lemma 15.1.3. 
8. Prove (15.1.6). 


15.2 ASSOCIATED FUNCTIONAL EQUATIONS 


The postulates obviously do not determine the A-averages uniquely but they do
lead to simple expressions for A(x_1, x_2, ..., x_n).

Our object is to show, for each A-average, the existence of a two-parameter
family of continuous, strictly monotone functions f such that for each n and each
choice of x_1, x_2, ..., x_n in (a, b) we have


f[A(x_1, x_2, ..., x_n)] = (1/n) Σ_{j=1}^{n} f(x_j). (15.2.1)


Conversely, if there exists a continuous strictly monotone function f which
satisfies (15.2.1) for all n, all x_j in (a, b) and for some function A(x_1, ..., x_n),
then A(x_1, ..., x_n) necessarily satisfies postulates (A_1) to (A_4). It will be shown
that if f is a solution of (15.2.1), then so is αf + β for any constants α and β.
We have here a family of functional equations of which the simplest member is


f[A(x_1, x_2)] = ½[f(x_1) + f(x_2)]. (15.2.2)


This looks like Jensen’s equation (14.3.10), to which it reduces in the case of the
arithmetic means where


A(s, t) = ½(s + t).


See also Eq. (14.3.11) for the geometric means. The first equation is satisfied by
f(u) = αu + β, the second by f(u) = α log u + β, as observed in Section 14.3.

There is a close connection between these mean-values and convex functions.
Thus the technique used in proving Theorem 13.3.1 applies in proving that (15.2.2)
implies (15.2.1) for all n. Further, just as in the case of convex functions, a
mean-value function f, i.e. a solution of (15.2.2), is either unbounded on every
interval or continuous (and strictly monotone). It is obviously the solutions of the
second kind that are of interest in the theory of averages.

Since A(s, t) is an intern transformation in the sense of Aczél, Theorem
14.3.1 applies to the present situation, but here we need a sharper statement. It
is not enough that there is at most one continuous solution satisfying a given
two-point condition; we have to show that there is at least one such solution.


Theorem 15.2.1. Let A be an A-average and let f satisfy (15.2.2) for all
x_1, x_2 in (a, b). Then f also satisfies (15.2.1) for all n and all x_1, x_2, ..., x_n in
(a, b).


Proof. We prove first that (15.2.1) holds for n = 2^m and start with m = 2. By
(A_4) and Lemma 15.1.2


A(x_1, x_2, x_3, x_4) = A[A(x_1, x_2), A(x_3, x_4)],
so that
f[A(x_1, x_2, x_3, x_4)] = ½{f[A(x_1, x_2)] + f[A(x_3, x_4)]}
= ¼[f(x_1) + f(x_2) + f(x_3) + f(x_4)],
which is (15.2.1) for n = 4. Complete induction takes care of the case n = 2^m for
m > 2.


Next, suppose that (15.2.1) holds for a particular value of n. We shall then
show that it holds for n − 1. Set


y = A(x_1, x_2, ..., x_{n−1})
and note that


f[A(x_1, x_2, ..., x_{n−1}, y)] = (1/n)[f(x_1) + f(x_2) + ... + f(x_{n−1}) + f(y)].


Now by (A_4) and (A_3)
A(x_1, x_2, ..., x_{n−1}, y) = A(y, y, ..., y) = y.
Hence


f(y) = (1/n)[f(x_1) + f(x_2) + ... + f(x_{n−1}) + f(y)]


or


f[A(x_1, x_2, ..., x_{n−1})] = (1/(n − 1))[f(x_1) + f(x_2) + ... + f(x_{n−1})]


as asserted. Since (15.2.1) is known to be true for n equal to a power of 2,
it follows that the formula is true for all n. This type of argument goes back to
Cauchy (Analyse algébrique, Paris, 1821). ∎


This means that we can disregard (15.2.1) and concentrate on the functional 
equation (15.2.2). 


Theorem 15.2.2. Let A be an A-average and let f satisfy (15.2.2) for all x_1, x_2
in (a, b). If f is bounded in some neighborhood of s = s_0, a < s_0 < b, then
f is continuous at s = s_0.


Proof. Suppose that s_0 + h also belongs to the neighborhood where |f(s)| < M
for all h with |h| < η. Then by (A_2) and Lemma 15.1.1


A(s_0, s_0 + h) = s_0 + p(s_0, h). (15.2.3)
Here h → p(s_0, h) is a continuous map and
sgn p(s_0, h) = sgn h, 0 < |p(s_0, h)| < |h|. (15.2.4)


Then by (15.2.2)
f[A(s_0, s_0 + h)] − f(s_0) = ½[f(s_0 + h) − f(s_0)].


We take absolute values and set
lim sup_{h→0} |f(s_0 + h) − f(s_0)| = δ(s_0). (15.2.5)
By the continuity of A together with (15.2.3) and (15.2.4) we have also


lim sup_{h→0} |f[A(s_0, s_0 + h)] − f(s_0)| = δ(s_0),
so that
δ(s_0) = ½δ(s_0) or δ(s_0) = 0
since δ(s_0) is finite. Thus f(s) is continuous at s = s_0 as asserted. ∎


Thus we see that local boundedness implies local continuity. We shall see that
local unboundedness implies global unboundedness under mild restrictions.
Consider the mapping defined by t → A(s, t) when s is fixed in (a, b). Since
A is a strictly increasing function of t and A(s, s) = s, it is seen that A maps (a, b)
onto an interval (s^−, s^+) where


a ≤ s^− < s < s^+ ≤ b. (15.2.6)
We have now 


Theorem 15.2.3. If f satisfies (15.2.2) in (a, b), if f(s) ≠ −∞ for all s in
(a, b), and if f is not bounded above in some interval (s_0 − η, s_0 + η) with
a < s_0 − η < s_0 + η < b, then for any choice of s_1 ≠ s_0 with a < s_1 − δ <
s_1 + δ < b, f cannot be bounded above in (s_1 − δ, s_1 + δ). Similarly for lower
unboundedness if f(s) ≠ +∞ for all s.


Proof. Consider first the interval (s_0^−, s_0^+) obtained by setting s = s_0 in (15.2.6).
Choose a point t in this interval. By assumption there exists a sequence {x_n} such
that (1) s_0 − η < x_n < s_0 + η, (2) x_n → s_0, (3) f(x_n) > 2n. Then


f[A(x_n, t)] = ½f(x_n) + ½f(t) > n + ½f(t),
which goes to infinity with n since f(t) ≠ −∞. Hence


lim f[A(x_n, t)] = +∞
while

lim A(x_n, t) = A(s_0, t).
This shows that f is unbounded above everywhere in (s_0^−, s_0^+). If this is the
interval (a, b), we are through. If not, let b_0 < b be the least upper bound of the
values of s for which f is locally unbounded above. Now for any choice of a
sequence {δ_n} with δ_n ↓ 0, the function f is not bounded above in any one of the
intervals (b_0 − δ_n, b_0 − δ_{n+1}) by assumption. Consider


f{A[b_0 − δ, ½(b_0 + b)]} = ½f(b_0 − δ) + ½f[½(b_0 + b)].


If b should happen to be +∞, we replace ½(b_0 + b) by some large positive
number, before proceeding to the next step. On the one hand, there exists a
sequence {δ_n} with δ_n ↓ 0, such that f(b_0 − δ_n) > 2n; on the other hand, A is
continuous, so that


A[b_0 − δ_n, ½(b_0 + b)] → A[b_0, ½(b_0 + b)] > b_0.


This shows that a b_0 < b cannot be the least upper bound for the values of s in
any neighborhood of which f is not bounded above. Thus b_0 = b. In the other
direction it is seen that the interval of local unboundedness above extends all the
way to s = a. In the same way it is proved that local unboundedness below
together with f(s) ≠ +∞ implies global unboundedness below. ∎


Thus it is clear that a solution f of (15.2.2) must be bounded and hence
continuous on compact subsets of (a, b) if it is going to be of any use for the
theory of A-averages.


Theorem 15.2.4. A continuous solution of (15.2.2) is either a constant or 
strictly monotone. 


Proof. Aczél’s uniqueness theorem applies to the present situation. Suppose that
f is a continuous solution of (15.2.2) and that there exist s_1 and s_2, s_1 ≠ s_2,
such that f(s_1) = f(s_2) = y_1. Then f(s) ≡ y_1 is a solution satisfying the given
two-point condition. Since the solution is unique we are dealing with a constant
solution of (15.2.2). If no such s_1, s_2 exist, then f, being continuous, is
strictly monotone. ∎


We come now to the existence theorem. 


Theorem 15.2.5. Given four real numbers s_0, s_1, y_0, y_1 with a < s_0 < s_1 < b,
y_0 < y_1. Then there exists a unique continuous, strictly increasing solution
of (15.2.2) defined for a < s < b such that


f(s_0) = y_0, f(s_1) = y_1. (15.2.7)


Proof. The emphasis here is on “there exists”; since Aczél’s uniqueness theorem
applies, we know in advance that a continuous solution is unique, if it satisfies
(15.2.7), and y_0 < y_1 implies that it is strictly increasing. In order to obtain a
strictly decreasing solution instead, we must assume y_0 > y_1.

In this discussion the functional equation (15.2.2) will be understood in the
following sense. If s and t are any two points in (a, b), if two real numbers f(s)
and f(t) are uniquely defined, then f(u) exists also for


u = A(s, t) and f(u) = ½[f(s) + f(t)]. (15.2.8)


By repeated application of this principle it is seen that, if f is defined for
u = s_0 and u = s_1, then f(u) exists and is well defined in a countable subset S of
[s_0, s_1] and this set is mapped in a 1-1 manner onto a subset Y of [y_0, y_1]. We
proceed to a description of these two sets.

This is most conveniently given in terms of an auxiliary set D consisting of
all the dyadic rationals in [0, 1]:


d = a_0 + a_1 2^{−1} + a_2 2^{−2} + ... + a_n 2^{−n} (15.2.9)


where a_j is 0 or 1 and a_0 = 1 iff all other a's are zero. To each such number d we
assign a number s_d in [s_0, s_1] where the indexing is determined by the
composition rule

s_γ = A(s_α, s_β) ⟺ γ = ½(α + β) (15.2.10)


and α, β, γ ∈ D. Here we begin with α = 0, β = 1 and proceed by successive
averaging. At the nth stage of the process we have labeled all those points of S
for which the index is of the form d = k2^{−n} with k = 0, 1, 2, ..., 2^n. If here k is
even, the point in question has already been labeled but it is clear that the
index is unchanged. We shall state or prove various properties of the three sets
D, S, Y. We start with Y.


Lemma 15.2.1. If d ∈ D, then
f(s_d) = d y_1 + (1 − d) y_0. (15.2.11)


Proof. We use induction on n in the expression (15.2.9) for d. The formula is
clearly true for d = 0, ½, 1, i.e. for n = 1. Suppose it is true for n = m and
consider a point s_d which was indexed at the (m + 1)th stage. Such a d is of the
form

d = (2p + 1)2^{−m−1} = ½[p2^{−m} + (p + 1)2^{−m}] = ½(α + β)


with obvious notation. Since (15.2.11) is true for n = m by assumption, we have
f(s_d) = ½[f(s_α) + f(s_β)]
= ½{p2^{−m} y_1 + [1 − p2^{−m}] y_0 + (p + 1)2^{−m} y_1 + [1 − (p + 1)2^{−m}] y_0}
= (2p + 1)2^{−m−1} y_1 + [1 − (2p + 1)2^{−m−1}] y_0 = d y_1 + (1 − d) y_0.


Thus the formula holds also for n = m + 1 and hence for all d ∈ D. ∎


An obvious corollary is that Y is dense in [y_0, y_1]. It is clear that D is dense
in [0, 1].
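The labeling rule (15.2.10) and formula (15.2.11) translate directly into a short computation. The sketch below is an illustration under assumed data: A is taken to be the geometric mean of two numbers, the two-point condition is f(1) = 0, f(4) = 1, and the function names are our own. For this choice the forced values should agree with log s / log 4.

```python
from fractions import Fraction
from math import sqrt, log, isclose

def dyadic_labels(A, s0, s1, stages):
    """Return {d: s_d} for all d = k / 2**stages in [0, 1], built by the rule
    s_gamma = A(s_alpha, s_beta) with gamma = (alpha + beta) / 2."""
    labels = {Fraction(0): s0, Fraction(1): s1}
    for _ in range(stages):
        ds = sorted(labels)                      # indices labeled so far
        for lo, hi in zip(ds, ds[1:]):           # adjacent dyadic indices
            labels[(lo + hi) / 2] = A(labels[lo], labels[hi])
    return labels

def f_on_S(labels, y0, y1):
    """Values forced on f by (15.2.11): f(s_d) = d*y1 + (1 - d)*y0."""
    return {s: float(d) * y1 + (1 - float(d)) * y0 for d, s in labels.items()}

A = lambda s, t: sqrt(s * t)                     # geometric mean (assumed example)
s0, s1, y0, y1 = 1.0, 4.0, 0.0, 1.0              # assumed two-point condition
table = f_on_S(dyadic_labels(A, s0, s1, stages=6), y0, y1)

# For the geometric mean the solution with these data is log(s) / log(4).
for s, y in table.items():
    assert isclose(y, log(s) / log(4.0), abs_tol=1e-9)
```

Exact fractions are used for the indices d so that the labels do not drift; only the points s_d themselves are floating-point numbers.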


Lemma 15.2.2. S is dense in [s_0, s_1].


Proof. If S were not dense in [s_0, s_1] there would exist an interval (α_0, β_0) no
point of which belongs to S, while the endpoints are at least in the closure S̄ of S.
If α_0 and β_0 are both in S, then so is A(α_0, β_0), which lies in (α_0, β_0). This is a
contradiction. If α_0 ∈ S, β_0 ∈ S̄, the continuity of A would give


α_0 < A(α_0, β_0) < β_0


and there is still a contradiction. The remaining possibilities are disposed of in
the same manner. It follows that S is dense in [s_0, s_1]. ∎


End of Proof of Theorem 15.2.5. At this stage we have f defined for all s in the
dense set S. This definition is to be completed in such a way that f is shown to
exist as a continuous strictly increasing function, first for all s in [s_0, s_1] and
then in the rest of the interval (a, b). The procedure in the first case is
obvious. If s° is not in S, it belongs to S̄ and we can find a sequence {s_k} which
converges to s° and such that s_k ∈ S for all k. Here each s_k is an s_d with d = d_k ∈ D
and the sequence {d_k} converges to a limit d° ∈ D̄. Here we have used the fact
that the mapping of D onto S is (1, 1), continuous, and monotone. But now the
mapping of S onto Y is also continuous and equi-monotone. To s = s_k corresponds
y = y_k where

y_k = f(s_k) = d_k y_1 + (1 − d_k) y_0. (15.2.12)
Hence we can define
f(s°) = lim_{k→∞} f(s_k) = d° y_1 + (1 − d°) y_0. (15.2.13)


Since the mapping of D onto Y is (1, 1), continuous, and monotone, it follows
that d° is uniquely determined by s° and does not depend upon the choice of the
particular sequence {s_k} as long as it belongs to S and converges to s°. The
obvious equality


f(s') − f(s'') = (d' − d'')(y_1 − y_0) (15.2.14)


will help the reader to clarify this important point. In this manner f is
defined everywhere in [s_0, s_1] as a continuous, strictly increasing solution of the
functional equation (15.2.2) which satisfies the given two-point condition and it is
the only solution with these properties.

We still have to extend the definition of f to the rest of the interval (a, b). It
is enough to indicate how this is done for the interval (s_1, b). The only tool at our
disposal is the functional equation


f[A(s, t)] = ½[f(s) + f(t)],


which the proposed solution is bound to satisfy. The equation links the values
of f for three values of u, namely u = s, u = t, u = A(s, t). If two of these values
are known, then the third is uniquely determined. We now know the value f(u)
for any u in [s_0, s_1]. We have to choose two values of u in [s_0, s_1] such that the
third associated value lies in (s_1, b). An advantageous choice would be t and s
such that

s_0 < t < s_1 < s < b, A(s, t) = s_1. (15.2.15)


Such a choice, however, is not always possible. What is desired is to solve the
equation
A(s, t) = s_1 (15.2.16)


for s, given the value of t in [s_0, s_1]. It is a case of the implicit function theorem
but under weaker assumptions than those of Theorem 6.1.1. On the other hand,
A(s, t) has rather special properties which guarantee the existence of a unique
solution for t close to s_1.

We return to (15.2.6), interchanging the roles of s and t, and note that the
mapping s → A(s, t), for fixed t, of (t, b) onto (t, t^+) varies continuously with t.
For t = s_1 we have t^+ = s_1^+ > s_1. This means that for a small δ > 0 and
0 < s_1 − t < δ there is an ε = ε(δ) such that 0 < s_1^+ − t^+ < ε. This says that
for such a t the function A(s, t) maps the interval s_1 < s < b onto an interval
containing the point u = s_1^+ − ε and, a fortiori, the point u = s_1. For A(s, t)
increases from A(s_1, t) < s_1 to t^+ > s_1^+ − ε > s_1 as s goes from s_1 to b. It
follows that, for s_1 − t small positive, Eq. (15.2.16) has a unique solution s = s_1(t)
such that


A[s_1(t), t] = s_1. (15.2.17)
We can then define


f[s_1(t)] = 2f(s_1) − f(t). (15.2.18)
If f(t) = d y_1 + (1 − d) y_0,
f[s_1(t)] = (2 − d) y_1 − (1 − d) y_0 = d_1 y_1 + (1 − d_1) y_0 (15.2.19)


where d_1 = 2 − d, so that 1 < d_1 < 2. The function s_1(t) is again continuous and
strictly monotone on its interval of definition, say [t_1, s_1]. Here t_1 is the least
value of t > s_0 such that s_1(t) < b. Thus f has been defined on some interval
(s_1, s_1(t_1)). If this is not all of (s_1, b), the argument is repeated. Instead of
(15.2.16) we use an equation


A(s, t) = s_2 (15.2.20)


where s_2 equals s_1(t_1) if f is defined for this value, otherwise some smaller value.
Actually, if s_1(t_1) < b we can take s_2 = s_1(t_1), for f is necessarily bounded to the
left of this point, since otherwise it would be unbounded in the whole interval of
definition, as we see by an adaptation of the argument used in proving
Theorem 15.2.3. It follows that f is at least right-continuous at this point and the
right hand limit can be taken as the definition of f for u = s_1(t_1). We can then
carry through the same argument as above and obtain f defined in a larger
interval. In the same way we extend to the left into (a, s_0).
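One step of this extension can be carried out numerically: solve A(s, t) = s_1 for s by bisection, then apply (15.2.18). The sketch below rests on assumptions of our own: A is the geometric mean, f is already known on [s_0, s_1] = [1, 4] as log u / log 4, and the finite number `b_cap` merely stands in for the right end of the interval so that the bisection has a bracket.

```python
from math import sqrt, log, isclose

def solve_for_s(A, t, target, upper, tol=1e-13):
    """Bisection for s in (target, upper) with A(s, t) = target.
    Assumes A(., t) is continuous and strictly increasing, A(target, t) < target
    and A(upper, t) > target."""
    lo, hi = target, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if A(mid, t) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

A = lambda s, t: sqrt(s * t)             # geometric mean (assumed example)
f_known = lambda u: log(u) / log(4.0)    # solution on [1, 4] with f(1) = 0, f(4) = 1
s1, b_cap = 4.0, 100.0                   # b_cap: hypothetical stand-in for b
t = 3.0                                  # a point of [s_0, s_1] close to s_1

s = solve_for_s(A, t, s1, b_cap)         # s_1(t); here sqrt(3 s) = 4 gives s = 16/3
f_ext = 2 * f_known(s1) - f_known(t)     # definition (15.2.18)

assert isclose(s, 16.0 / 3.0, rel_tol=1e-10)
assert isclose(f_ext, f_known(s), rel_tol=1e-9)   # agrees with log(s) / log(4)
```

The last assertion shows that the value assigned by (15.2.18) beyond s_1 coincides with the value of the known closed-form solution, as the uniqueness theorem requires.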
Suppose that b_0 < b is the least upper bound for the values of s for which
f(s) may be defined in this manner. Again we see that f must be bounded to the
left of s = b_0 and tend to a finite limit as s increases to b_0. But then we can solve
the equation
A(s, t) = b_0 (15.2.21)


for s in terms of t for t < b_0 but close to b_0. Here s(t) > b_0 and we can set
f[s(t)] = 2f(b_0) − f(t).
This shows that we must have b_0 = b and in the same manner it is seen that
the greatest lower bound for the values of s for which f is definable equals a. ∎

There is a converse of this result.


Theorem 15.2.6. Let s → f(s) be a continuous, strictly monotone mapping
defined on an interval (a, b). Let g be the inverse function of f so that


g[f(s)] = f[g(s)] = s, ∀ s ∈ (a, b).
Then
A(s_1, s_2, ..., s_n) = g[(1/n) Σ_{j=1}^{n} f(s_j)] (15.2.22)


is defined for all n and all s_j in (a, b). It is, moreover, a mean value satisfying
postulates (A_1) to (A_4).


Proof. To fix the ideas, suppose that f is strictly increasing. Then, for a given
n and a choice of the s_j as indicated, the quantity


(1/n) Σ_{j=1}^{n} f(s_j) (15.2.23)


belongs to the range of f and hence to the domain of g. Thus A(...) is well defined.
Moreover, the sum is an increasing continuous function of each of its arguments
since f has these properties. Since g is continuous and strictly increasing,
A(..., s_j, ...) is a continuous strictly increasing function of s_j. Thus (A_1) and
(A_2) are satisfied. Postulate (A_3) is obviously true and (A_4) is a trivial consequence
of the identity


g{(1/n)[k f(y) + f(s_{k+1}) + ... + f(s_n)]} = g[(1/n) Σ_{j=1}^{n} f(s_j)],


for we have
f[A(y, ..., y, s_{k+1}, ..., s_n)] = (1/n)[k f(y) + Σ_{j=k+1}^{n} f(s_j)], k f(y) = Σ_{j=1}^{k} f(s_j),
whence (A_4) follows. ∎
Among the possible choices for f we note the following:


(1) f(s) = s^α, α ≠ 0, which gives the power means
M_α(x) = [(1/n) Σ_{j=1}^{n} x_j^α]^{1/α}. (15.2.24)
Normally we have to take (a, b) = (0, ∞) but if α is an odd positive
integer, (−∞, +∞) is admissible.
(2) α = −1, f(s) = 1/s, s > 0, gives the harmonic mean.


(3) The case α = 0 is clearly excluded in (15.2.24). Since f is not uniquely
determined by the mean we may replace f by cf + d, choosing c and d in
such a manner that a limit exists for α → 0. The desired choice is


(1/α)[s^α − 1].


The limit function log s defines the geometric mean.


(4) A choice which has not figured above is f(s) = exp (αs) with α ≠ 0. This
gives


A(s, t) = (1/α) log {½[exp (αs) + exp (αt)]}, (15.2.25)


which is basic in information theory.
Mean values involving the sine, the tangent, and their inverses figure in various
exercises above and below. Since there is an ample supply of continuous strictly
monotone functions, the reader can construct mean values ad lib.
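The construction (15.2.22) is easy to experiment with. The following sketch builds an average from a pair (f, g) supplied by the user and checks a few of the statements above on sample numbers; the helper names `make_average`, `power_mean` and `exponential_mean` are our own, and it is assumed that g really is the inverse of f on the interval in question.

```python
from math import exp, log, isclose

def make_average(f, g):
    """A-average of (15.2.22): A(s_1, ..., s_n) = g(mean of f(s_j)), g inverse to f."""
    def A(*s):
        return g(sum(f(x) for x in s) / len(s))
    return A

def power_mean(alpha):
    """Choice (1): f(s) = s**alpha on (0, inf), alpha != 0."""
    return make_average(lambda s: s ** alpha, lambda u: u ** (1.0 / alpha))

def exponential_mean(alpha):
    """Choice (4): f(s) = exp(alpha * s), which gives (15.2.25) for n = 2."""
    return make_average(lambda s: exp(alpha * s), lambda u: log(u) / alpha)

harmonic = power_mean(-1.0)           # choice (2)
geometric = make_average(log, exp)    # limit case (3)

xs = (1.0, 2.0, 4.0)
assert isclose(harmonic(*xs), 3.0 / (1.0 + 0.5 + 0.25))
assert isclose(geometric(*xs), 2.0)

# (A_3): the average of equal numbers is that number.
assert isclose(geometric(5.0, 5.0, 5.0), 5.0)

# (A_4) with k = 2: replace x_1, x_2 by their average y repeated twice.
y = geometric(1.0, 2.0)
assert isclose(geometric(y, y, 4.0), geometric(*xs))

# The power means approach the geometric mean as alpha -> 0.
assert isclose(power_mean(1e-6)(*xs), geometric(*xs), rel_tol=1e-5)
```

Replacing f by cf + d with c ≠ 0 leaves the average unchanged, which is why (1/α)(s^α − 1) and s^α define the same power mean.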


EXERCISE 15.2 


1. Some mean values are translation-invariant; i.e. if each s_j is replaced by s_j + a for a
fixed a, then the average is changed by the same amount:


A(s_1 + a, s_2 + a, ..., s_n + a) = A(s_1, s_2, ..., s_n) + a.


Show that the arithmetic mean, f(s) = s, and the means defined by (15.2.25) have
this property.


2. According to M. Nagumo, these are the only translation invariant means. Prove.
[Hint: It is suggested to show that f(s + a) = g(a) f(s) + h(a) with g(a) ≠ 0, ∀ a.
From this conclude that f and g have continuous derivatives and f'(s) = αf(s) + β
with constant α and β. Integration proves the assertion.]


3. Some mean values are homogeneous; i.e.
A(as_1, as_2, ..., as_n) = aA(s_1, s_2, ..., s_n).
Show that the power means and the geometric mean have this property.


4. [Nagumo] Show that these are the only homogeneous means satisfying the postulates.
[Hint: Find a direct proof or reduce to Problem 2 by setting s = e^t.]


5. The method used in proving Theorem 15.2.2 can be used to throw some light on
existence questions for some equations of the Aczél-Hosszú type (14.3.16). Suppose
that A(s, t) is a mean value in the sense used here and consider the equation


f[A(s, t)] = ½[f(s) + f(t)] + Σ_{k=1}^{∞} c_k [f(s) − f(t)]^{2k}


where the c's are non-negative and the series Σ c_k u^{2k} defines an entire function. Let f
be a locally bounded solution whose oscillation


lim sup_{h→0} |f(s + h) − f(s)| = ω(s)


satisfies the (highly restrictive) condition
Σ_{k=1}^{∞} c_k [ω(s)]^{2k−1} < ½.


Show that f is necessarily continuous at all points where this condition holds.
6. Determine s^−(t) and s^+(t) when f(s) = arctan s. How do they vary with t?
7. If A(s, t) is an admissible mean value and if


s^−(t) = lim_{s↓a} A(s, t), s^+(t) = lim_{s↑b} A(s, t),


show that these mappings, if finite, are continuous, strictly monotone functions of t.
8. Same question for s(t, s°), the solution of
A(s, t) = s° with s(s°, s°) = s°.
9. In the proof of Theorem 15.2.5 carry out the extension of f to the interval (a, s_0).


10. Suppose that s → f(s) defines an admissible mean value and that f is analytic,
holomorphic at every point s = s_0 of (a, b). Prove or disprove the assertion that
A(s, t) is also analytic.

11. Suppose instead that A(s, t) is given and is analytic; i.e. for every point (s_0, t_0) with
real coordinates in the interval (a, b) there is an absolutely convergent double series
expansion in terms of powers of s − s_0 and t − t_0. Discuss the analyticity of the
corresponding function f.


12. In information theory the mapping (s, t) → inf (s, t) plays an important role. Set
A(s_1, s_2, ..., s_n) = inf s_j. This is not a mean value in the sense used here but becomes
acceptable if (A_2) is slightly weakened. How should this be done? Verify that the
other postulates hold.


13. If A(s, t) is defined as in the preceding problem, what is the corresponding function
f? What two-point conditions are admissible? Is (s, t) → A(s, t) an intern
transformation?


14. What would be the answer to Problem 10 in the present case? 


15.3 REMARKS ON SUMMABILITY 


Averaging processes have two important applications: to the smoothing of 
statistical data and to the summation of non-convergent infinite series or, equiva- 
lently, to the limitation of infinite sequences. We shall make some remarks on 
the second application. Here the preservation of existing limits is essential. 


Theorem 15.3.1. An averaging process A satisfying (A_1) to (A_4) is limit
preserving, i.e. if {x_n} is a convergent sequence of numbers in the interval
(a, b), lim_{n→∞} x_n = y_0, where y_0 ≠ a and b, then


lim_{n→∞} A(x_1, x_2, ..., x_n) = y_0. (15.3.1)


For the proof we need the following.


Lemma 15.3.1. The average of k numbers c and n numbers d converges to d if
n goes to infinity in such a manner that k/n → 0.


Proof. This is an immediate consequence of the existence of a continuous strictly
monotone function f satisfying (15.2.1). If A_{k,n} is the average of k entries c and
n entries d we have


f(A_{k,n}) = [k/(k + n)] f(c) + [n/(k + n)] f(d). (15.3.2)
As n → ∞ while k/n → 0, the right hand side goes to f(d). If g is the existing
continuous strictly monotone inverse of f, then


lim A_{k,n} = lim g{[k/(k + n)] f(c) + [n/(k + n)] f(d)}


= g[lim f(A_{k,n})] = g[f(d)] = d


as asserted. ∎


Proof of Theorem 15.3.1. Since f does not necessarily exist or be continuous at
either end of the interval we have assumed that a < y_0 < b. Suppose now that
for all j we have a < α < x_j < β < b and that for a given ε > 0 it is known that
α < y_0 − ε < x_j < y_0 + ε < β for j > k. Then by (A_2) and Lemma 15.1.1
A(x_1, x_2, ..., x_n) lies between


A(α, α, ..., α, y_0 − ε, ..., y_0 − ε)
and


A(β, β, ..., β, y_0 + ε, ..., y_0 + ε).


Here α and β occur k times each while y_0 − ε and y_0 + ε occur n − k times each.
By the lemma, the first of these expressions converges to y_0 − ε, the second to
y_0 + ε as n → ∞. Since ε is arbitrary, the assertion follows. ∎
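A small numerical experiment illustrates the limit-preserving property. The sketch assumes the geometric mean as the averaging process and the sample sequence x_n = 2 + (−1)^n / n, which converges to 2; both are arbitrary choices made for illustration.

```python
from math import exp, log

def geometric(xs):
    """Geometric mean of a list of positive numbers."""
    return exp(sum(log(x) for x in xs) / len(xs))

def running_averages(xs, mean):
    """Return A(x_1), A(x_1, x_2), ..., A(x_1, ..., x_n)."""
    return [mean(xs[:n]) for n in range(1, len(xs) + 1)]

# x_n = 2 + (-1)**n / n lies in (0, inf) and converges to y_0 = 2.
xs = [2.0 + (-1) ** n / n for n in range(1, 501)]
avgs = running_averages(xs, geometric)

# The averages approach y_0 = 2, as Theorem 15.3.1 asserts.
assert abs(avgs[-1] - 2.0) < 1e-2
assert abs(avgs[-1] - 2.0) < abs(avgs[9] - 2.0)
```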


Repeated smoothing of data is a well-known device in statistics and repeated
averaging is also used in the theory of summability. This device was
introduced by Otto Hölder (of the Hölder inequality) in 1882 for the arithmetic
means, f(s) = s. More generally, let A_1 and A_2 be two averages, distinct or not,
and form


[A_1 * A_2](x_1, x_2, ..., x_n)
= A_1[A_2(x_1), A_2(x_1, x_2), ..., A_2(x_1, x_2, ..., x_n)]. (15.3.3)


We have then what is known as a consistency theorem: 


Theorem 15.3.2. If A_1 and A_2 satisfy the postulates and if {x_n} is a sequence
such that
lim A_2(x_1, x_2, ..., x_n) = y_0, (15.3.4)
then
lim [A_1 * A_2](x_1, x_2, ..., x_n) = y_0. (15.3.5)


Proof. Combine (15.3.1) and (15.3.3). ∎
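The composition (15.3.3) can be written out in a few lines. The sketch below takes both factors to be the arithmetic mean, which is Hölder's original case, and applies the composite to the divergent sequence 1, 0, 1, 0, ...; the function names and the sample sequence are assumptions made for the illustration.

```python
def arithmetic(xs):
    """Arithmetic mean of a list of numbers."""
    return sum(xs) / len(xs)

def compose(A1, A2):
    """[A1 * A2](x_1..x_n) = A1(A2(x_1), A2(x_1, x_2), ..., A2(x_1..x_n)), cf. (15.3.3)."""
    def composed(xs):
        return A1([A2(xs[:m]) for m in range(1, len(xs) + 1)])
    return composed

# The sequence 1, 0, 1, 0, ... does not converge, but its arithmetic means tend to 1/2.
xs = [float((n + 1) % 2) for n in range(400)]
H2 = compose(arithmetic, arithmetic)

assert abs(arithmetic(xs) - 0.5) < 1e-12
# Consistency: the composite average of the same sequence is also close to 1/2.
assert abs(H2(xs) - 0.5) < 2e-2
```

The inner averages form a convergent sequence, and the outer average then preserves its limit by Theorem 15.3.1, which is all the proof uses.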


Since, in general, A_1 * A_2 ≠ A_2 * A_1, a sequence {x_n} which has an [A_1 * A_2]-limit
need not have an [A_2 * A_1]-limit, and, if both exist, they need not be equal,
that is, A_1 and A_2 need not be consistent in their common domain of
applicability.

There is no reason to stop with two “factors” in the composition of
averages. Except for Hölder means, not much use has been made of this device.
The composition is a fairly clumsy device and the restriction to a common
interval (a, b) for the two processes may cause difficulties.


EXERCISE 15.3 


1. Give some instances where Theorem 15.3.2 remains valid for y_0 = a or b.


2. Suppose it is known that x_{2n} → c, x_{2n+1} → d where c ≠ d. Show that, nevertheless,
the sequence {x_n} has an A-limit the value of which depends upon the choice of A.


3. Suppose that A_1 and A_2 are two processes satisfying the postulates and that
a < s < A_1(s, t) < A_2(s, t) < t for all s, t in (a, b). Given two numbers c and d,
a < c < d < b, form the sequences {c_n} and {d_n} where c_0 = c, d_0 = d and


c_n = A_1(c_{n−1}, d_{n−1}), d_n = A_2(c_{n−1}, d_{n−1}), n > 0.
Show that the first sequence is increasing, the second decreasing and they have the
same limit. For A_1(s, t) = √(st), A_2(s, t) = ½(s + t), the limit is the arithmetico-geometric
mean of Gauss.


4. Give an example of a sequence of positive numbers for which the geometric and the 
arithmetic means have different limits. 


5. It was shown by T. Carleman (1923) that

$$\sum_{n=1}^{\infty} (a_1 a_2 \cdots a_n)^{1/n} < e \sum_{n=1}^{\infty} a_n$$

if the right member is a convergent series with positive terms. Is the series

$$\sum_{n=1}^{\infty} \frac{1}{n} (a_1 + a_2 + \cdots + a_n)$$

convergent under such circumstances?
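The iteration described in Problem 3 is easy to try numerically. The following is a minimal sketch in Python for the special case $A_1(s, t) = \sqrt{st}$, $A_2(s, t) = \tfrac{1}{2}(s + t)$ named there. The function name `agm_sequences`, the stopping tolerance, and the starting values 1 and 2 are illustrative assumptions and do not come from the text.

```python
import math


def agm_sequences(c, d, tol=1e-14, max_iter=60):
    """Return the lists [c_0, c_1, ...] and [d_0, d_1, ...] of Problem 3
    for A_1 = geometric mean and A_2 = arithmetic mean (requires 0 < c < d)."""
    lower, upper = [c], [d]
    for _ in range(max_iter):
        # Both new terms are formed from the previous pair (c_{n-1}, d_{n-1}).
        c, d = math.sqrt(c * d), 0.5 * (c + d)
        lower.append(c)
        upper.append(d)
        if d - c <= tol * d:
            break
    return lower, upper


lower, upper = agm_sequences(1.0, 2.0)
for c_n, d_n in zip(lower, upper):
    print(f"{c_n:.12f}  {d_n:.12f}")
```

The printed columns should show the first sequence rising and the second falling toward a common value, which is what the problem asks the reader to prove in general.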


15.4 SOME GEOMETRIC EXTREMAL PROBLEMS 


We shall study a geometric extremal problem whose solution is essential for the 
discussion in Section 15.7. The point of departure is an innocent-looking 
problem in maxima and minima with side conditions. A triangle is inscribed in 
the unit circle, the lengths of the sides being $s_1$, $s_2$, $s_3$. What triangle will
maximize a given symmetric function of $s_1$, $s_2$, $s_3$? The solution is often, but not
always, given by the equilateral triangle. There may exist improper solutions
formed by degenerate triangles, i.e. double diameters. Here the methods of the
calculus carry us a long way. Now consider the corresponding problem in three
dimensions: a tetrahedron is inscribed in the unit sphere; what configuration will
maximize a given symmetric function of the lengths of the edges? Here the methods
of the calculus are apt to fail right away. The problem generalizes to higher
dimensions. In three-space we could, of course, also pose the problem for the faces
of the tetrahedron instead of for the edges.

To return to edges and higher dimensions, let n points $P_1, P_2, \ldots, P_n$ be given
on the unit sphere in $R^{n-1}$

$$\sum_{j=1}^{n-1} x_j^2 = 1. \tag{15.4.1}$$


Join the points by line segments $P_jP_k$ for $1 \le j < k \le n$. These are the edges of
an n-simplex. It is said to be regular if all the edges have the same length. We 
may ask: For what n-simplex is the sum of the lengths of the edges a 
maximum? Instead of maximizing the sum of the lengths, we could maximize 
the arithmetic mean of the lengths and, for that matter, any average of the type 
described in this chapter. Here the outcome may be expected to depend upon 
what averaging process is used. The answer is not obvious and already in the 
plane unexpected things will happen. For the equilateral triangle inscribed in the 
unit circle the length of the edge is $\sqrt{3}$ and any averaging process satisfying (A3)
would give the same value. If any other triangle should give a larger average,
considerations of symmetry show that the triangle must be degenerate. For a
double diameter we obtain A(2, 2, 0), if this expression makes sense, and the
problem reduces to the question of for what A-averages is

$$A(2, 2, 0) > \sqrt{3}\,? \tag{15.4.2}$$

Take, in particular, $A = M_p$, p > 0, the pth power mean. Here the left member
of (15.4.2) is

$$A(2, 2, 0) = 2 \left( \tfrac{2}{3} \right)^{1/p}$$

and for large values of p this is arbitrarily close to $2 > \sqrt{3}$.
We shall give a solution of this simplicial problem in any Euclidean space for 
a large class of averages. 
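The comparison behind (15.4.2) can be checked numerically for the power means. The sketch below evaluates $M_p$ on the edge lengths of the equilateral triangle and of the double diameter; the helper name `power_mean` and the sample orders p are illustrative choices, not notation from the text.

```python
import math


def power_mean(values, p):
    """p-th power mean M_p for p > 0 (zero entries are allowed)."""
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


EQUILATERAL = [math.sqrt(3.0)] * 3   # edges of the inscribed equilateral triangle
DOUBLE_DIAMETER = [2.0, 2.0, 0.0]    # degenerate triangle: two vertices coincide

for p in (1, 2, 3, 10, 100):
    degenerate = power_mean(DOUBLE_DIAMETER, p)
    closed_form = 2.0 * (2.0 / 3.0) ** (1.0 / p)
    # The last column reports whether the double diameter beats the
    # equilateral triangle for this order p.
    print(p, degenerate, closed_form, degenerate > power_mean(EQUILATERAL, p))
```

For small p the equilateral triangle wins, and for large p the double diameter does; locating the exact crossover is the subject of Problem 1 of Exercise 15.4.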


Theorem 15.4.1. The regular n-simplex in the unit sphere in $R^{n-1}$ maximizes
the A-average of the lengths of the edges of the simplex provided the
function f defining A satisfies one of the following four conditions:

(1) $f(s) = s^p$, $p \le 2$, $p \neq 0$;

(2) $f(s) = \log s$;

(3) f(s) is strictly convex and decreasing;

(4) f(s) is strictly concave and increasing.

The solution is unique except for p = 2. If $f(s) = s^p$ and p > 2, then the
regular simplex does not give a maximum, and if n is even, the maximum is
furnished by a multiple diameter.


The proof will be given in several stages. The point of departure is the 
generalized Parallelogram Law, formula (2.5.7). This gives the solution for 


$f(s) = s^2$.
We recall that if X is any inner product space with the usual conventions and
notation, then for any choice of n vectors $x_1, x_2, \ldots, x_n$, distinct or not, we have

$$\sum_{1 \le j < k \le n} \|x_j - x_k\|^2 + \Big\| \sum_{j=1}^{n} x_j \Big\|^2 = n \sum_{j=1}^{n} \|x_j\|^2. \tag{15.4.3}$$

In particular, if the $x_j$'s are unit vectors, we get the inequality

$$\sum_{1 \le j < k \le n} \|x_j - x_k\|^2 \le n^2 \tag{15.4.4}$$

with equality iff

$$\sum_{j=1}^{n} x_j = 0, \tag{15.4.5}$$

i.e. the centroid of the endpoints of the vectors falls at the origin.
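The generalized parallelogram law and the resulting bound on the sum of squared distances between unit vectors can be spot-checked on random data. The sketch below uses only the Python standard library; the function names, the seed, the dimension 4 and the count of 7 vectors are arbitrary illustrative choices.

```python
import math
import random


def squared_norm(v):
    return sum(c * c for c in v)


def random_unit_vector(dim, rng):
    """A random point on the unit sphere of R^dim (normalized Gaussian sample)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        length = math.sqrt(squared_norm(v))
        if length > 1e-12:
            return [c / length for c in v]


def parallelogram_terms(vectors):
    """Return (sum of squared pairwise distances, squared norm of the vector
    sum, n times the sum of squared norms) for the generalized law."""
    n = len(vectors)
    pair_sum = sum(
        squared_norm([a - b for a, b in zip(vectors[j], vectors[k])])
        for j in range(n)
        for k in range(j + 1, n)
    )
    total = [sum(coords) for coords in zip(*vectors)]
    return pair_sum, squared_norm(total), n * sum(squared_norm(v) for v in vectors)


rng = random.Random(0)
vectors = [random_unit_vector(4, rng) for _ in range(7)]
pair_sum, centroid_term, right_side = parallelogram_terms(vectors)
print(pair_sum + centroid_term, right_side)   # the two sides of the identity
print(pair_sum <= len(vectors) ** 2)          # the bound for unit vectors
```

Because the squared norm of the vector sum is non-negative, the bound is attained exactly when that term vanishes, which is the centroid condition stated above.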

At this stage there is no relation between the number n of the vectors and 
the dimension of the space. Suppose now that the $x_j$'s are unit vectors in
$R^{n-1}$. Then the summands on the left, $\tfrac{1}{2}n(n - 1)$ in number, are the squares of
the lengths of the edges of the n-simplex and we thus obtain


Lemma 15.4.1. If $f(s) = s^2$, the regular n-simplex maximizes the $M_2$-average
of the lengths of the sides of the inscribed simplex. The maximizing
configuration is unique iff n = 3.


To see the truth of the last remark we observe that if three unit vectors
$x_1$, $x_2$, $x_3$ are located in the plane and subjected to the condition

$$x_1 + x_2 + x_3 = 0,$$

then they determine the vertices of an equilateral triangle inscribed in the unit
circle. On the other hand, if n > 3, then the condition $\sum x_j = 0$ does not
determine the x's up to a rotation. Thus besides the regular simplex, there are
infinitely many simplices for which the sum of the squares of the lengths of the
edges reaches the maximum value $n^2$. For future reference we note that the
length of one of the edges in a regular n-simplex is

$$\sqrt{2} \left( \frac{n}{n-1} \right)^{1/2} \tag{15.4.6}$$

if the simplex is inscribed in the unit sphere in $R^{n-1}$. This expression tends to
the limit $\sqrt{2}$ as $n \to \infty$, a fact important for the following.

This settles the mean square case. The remaining cases of Theorem 15.4.1 
are essentially an exercise in the use of Hölder's inequality and the properties
of convex functions. 

We start with the power means $M_p$ with 0 < p < 2. Let n points
$P_1, P_2, \ldots, P_n$ be given on the unit sphere (15.4.1) in $R^{n-1}$ and let $d_{jk}$ denote the
length of the line segment joining $P_j$ and $P_k$. These are the edges of the
corresponding n-simplex and our problem is to maximize $M_p(d_{jk})$. Now for fixed
entries the power mean is an increasing function of the order p. Hence

$$M_p(d_{jk}) \le M_2(d_{jk}) \le \max M_2(d_{jk}) = \sqrt{2} \left( \frac{n}{n-1} \right)^{1/2}. \tag{15.4.7}$$

Here equality holds in the first place iff all $d_{jk}$ are equal, in the second place
iff the centroid of the vertices is at the origin. Both conditions hold
simultaneously iff the simplex is regular. Thus we have proved


Lemma 15.4.2. For 0 < p < 2 the p-th mean of the lengths of the edges of an
n-simplex inscribed in the unit sphere in $R^{n-1}$ is a maximum iff the simplex is
regular.


Again the maximum is given by (15.4.6) and it is now reached for a
configuration which is unique up to a rotation about the origin.

Next we take the power means with 2 < p. Here the discussion is based upon
the observation that $s^p \le s^2$ for $0 \le s \le 1$ with equality iff s is either 0 or 1. This
leads to the following sequence of inequalities:

$$\sum_{j<k} d_{jk}^p = 2^p \sum_{j<k} \left( \tfrac{1}{2} d_{jk} \right)^p \le 2^p \sum_{j<k} \left( \tfrac{1}{2} d_{jk} \right)^2 = 2^{p-2} \sum_{j<k} d_{jk}^2 \le 2^{p-2} n^2. \tag{15.4.8}$$

Note that the edges of the simplex have a length at most equal to 2. Thus dividing
by 2 in the first step ensures that $0 \le \tfrac{1}{2} d_{jk} \le 1$, so that the inequality $s^p \le s^2$ can be
used. Now equality holds in the first doubtful place iff each $d_{jk}$ is either 0 or 2 and
in the second place iff the centroid is at the origin. If n is even, n = 2m, both
conditions may be satisfied by letting m vertices coincide at one point of the sphere
and the remaining m vertices coincide at the antipode. In other words, choose m
vectors $x_j$ equal to x and the other m vectors equal to $-x$ where x is a unit vector.
If we move the $x_j$'s, one pair at a time, one to x and the other to $-x$, we see that we
end up with $m^2$ distances $d_{jk}$ equal to 2 and the remaining $m^2 - m$ distances equal
to 0. For this choice we have equality all the way through in (15.4.8). It follows
that the maximum value of the pth mean, p > 2, of the lengths of the edges of the
inscribed n-simplex is

$$2^{1 - 1/p} \left( \frac{n}{n-1} \right)^{1/p} \tag{15.4.9}$$

and this is reached for n even by a degenerate n-simplex, a multiple diameter. Since
this supremum exceeds (15.4.6) for p > 2, the regular simplex cannot maximize the
$M_p$-power means. In particular, this holds for the tetrahedron which served as our
point of departure.
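The contrast between the regular simplex and the multiple diameter can be made concrete for small n. The sketch below builds both configurations for n = 4 and compares their power means; the constructions (`regular_simplex` as the coordinate vectors of $R^n$ shifted to their centroid and rescaled, `multiple_diameter` as m copies of x and m copies of $-x$) are illustrative choices consistent with the text, and all function names are assumptions.

```python
import itertools
import math


def power_mean(values, p):
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)


def edge_lengths(points):
    return [math.dist(a, b) for a, b in itertools.combinations(points, 2)]


def regular_simplex(n):
    """n unit vectors with equal mutual distances: e_k minus the centroid,
    rescaled to length 1. They lie in an (n-1)-dimensional subspace of R^n."""
    scale = math.sqrt(1.0 - 1.0 / n)
    return [[((1.0 if i == k else 0.0) - 1.0 / n) / scale for i in range(n)]
            for k in range(n)]


def multiple_diameter(n):
    """m copies of a unit vector x and m copies of -x, for even n = 2m."""
    m = n // 2
    x = [1.0] + [0.0] * (n - 1)
    minus_x = [-c for c in x]
    return [x] * m + [minus_x] * m


n = 4
for p in (1.0, 2.0, 3.0, 6.0):
    regular = power_mean(edge_lengths(regular_simplex(n)), p)
    degenerate = power_mean(edge_lengths(multiple_diameter(n)), p)
    # Closed form for the degenerate configuration, 2^(1-1/p) (n/(n-1))^(1/p).
    bound = 2.0 ** (1.0 - 1.0 / p) * (n / (n - 1)) ** (1.0 / p)
    print(p, regular, degenerate, bound)
```

For p below 2 the regular simplex gives the larger mean, at p = 2 the two configurations tie, and for p above 2 the multiple diameter is ahead, in line with the discussion above.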

We turn now to case (3): f strictly convex and decreasing. Now if f is strictly
convex,

$$f\Big( \frac{1}{m} \sum_{j=1}^{m} s_j \Big) < \frac{1}{m} \sum_{j=1}^{m} f(s_j) \tag{15.4.10}$$

unless all the $s_j$'s are equal. As above, let $d_{jk}$ denote the length of the edge $P_jP_k$ and
let $d_N$ stand for the average of the $d_{jk}$ as defined by f. Here $N = \tfrac{1}{2}n(n - 1)$ and

$$f(d_N) = \frac{1}{N} \sum_{j<k} f(d_{jk}) \ge f\Big( \frac{1}{N} \sum_{j<k} d_{jk} \Big) \ge f\Big[ \sqrt{2} \Big( \frac{n}{n-1} \Big)^{1/2} \Big]. \tag{15.4.11}$$

To get equality throughout, the centroid of the vertices must be at the origin and
all $d_{jk}$ must be equal. Note that f is both strictly convex and strictly decreasing.
We have also used the known value of the maximum for $M_1$ which enters in the last
inequality in (15.4.11) as well as in (15.4.6). It follows that for each $n \ge 3$ the
maximizing configuration is given by the regular n-simplex, which is unique up to a
rotation.

In particular, this result applies to $f(s) = s^p$ for p < 0. It also applies to the
A-average defined by $f(s) = \cot(\tfrac{1}{4}\pi s)$ where a = 0 and b = 2.

In the same manner we prove case (4). This includes $f(s) = \log s$ and
$f(s) = s^p$ for 0 < p < 1. Here $(a, b) = (0, \infty)$. Another possibility is
$f(s) = \sin(\tfrac{1}{4}\pi s)$, (a, b) = (0, 2).
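The averages mentioned for cases (3) and (4) can be computed directly once a generating function is fixed. The sketch below assumes, as the formulas of this chapter suggest, that the A-average generated by f is the number A with f(A) equal to the arithmetic mean of the values f(s_j); the helper `a_average` and the three wrappers are hypothetical names introduced for illustration.

```python
import math


def a_average(values, f, f_inverse):
    """A-average generated by f: the number A with f(A) = mean of f(values)."""
    return f_inverse(sum(f(v) for v in values) / len(values))


QUARTER_PI = math.pi / 4.0


def cot_average(values):
    """Case (3) example: f(s) = cot(pi*s/4), convex and decreasing on (0, 2)."""
    return a_average(
        values,
        lambda s: 1.0 / math.tan(QUARTER_PI * s),
        lambda y: math.atan(1.0 / y) / QUARTER_PI,   # valid because y > 0 on (0, 2)
    )


def sin_average(values):
    """Case (4) example: f(s) = sin(pi*s/4), concave and increasing on (0, 2)."""
    return a_average(
        values,
        lambda s: math.sin(QUARTER_PI * s),
        lambda y: math.asin(y) / QUARTER_PI,
    )


def geometric_mean(values):
    """Case (2): f(s) = log s."""
    return a_average(values, math.log, math.exp)


edges = [0.5, 1.0, 1.5]
arithmetic = sum(edges) / len(edges)
print(cot_average(edges), sin_average(edges), geometric_mean(edges), arithmetic)
```

In each of these cases Jensen's inequality forces the A-average to lie at or below the arithmetic mean of the same entries, which is why the bound for the arithmetic mean of the edge lengths carries over to them.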


EXERCISE 15.4 


1. Justify the discussion of (15.4.2) for the pth power means. Determine the critical
value of p, $p = p_0$, beyond which the equilateral triangle fails to maximize the
average.


2. The calculus problem of finding maxima and minima for the pth means of the lengths
of the sides in a triangle inscribed in the unit circle reduces to finding maxima and
minima of

$$F(x, y) = [\sin x]^p + [\sin y]^p + [\sin(x + y)]^p,$$

where x, y, x + y are in $[0, \pi]$. Verify! Show that $x = y = \tfrac{1}{3}\pi$ (i.e. the equilateral
triangle) gives a local maximum of F(x, y) for 0 < p < 4, a local minimum for 4 < p
and neither a maximum nor a minimum for p = 4.


3. Verify (15.4.3). 
4. Write out a proof for case (4) of Theorem 15.4.1. 


5. Let f be strictly increasing and strictly convex. Let $d_n$ denote the maximum of the
corresponding A-average for the n-simplex inscribed in the unit sphere. It is supposed
that f is defined for s = 0 and s = 2 with $0 \le f(0) < f(2) < \infty$. Show that

$$\sqrt{2} \le \liminf_{n \to \infty} d_n \le \limsup_{n \to \infty} d_n \le 2.$$

[Hint: Use the fact that the graph of f for 0 < s < 2 lies below the chord joining
(0, f(0)) with (2, f(2)). Actually $\lim d_n$ exists, as will be shown later.]


6. Take $f(s) = (1/s) \tan cs$ with $0 < c < \tfrac{1}{4}\pi$. Show that f defines an A-average which
satisfies the conditions of the preceding problem. Make plausible that for n = 2m
the completely degenerate n-simplex consisting of just a multiple diameter gives the
maximum value for the average. Show also that if $\lim d_{2m} = d_0$, then $d_0$ is the root
of the equation

$$\tan cx = \tfrac{1}{2} \left[ c + \tfrac{1}{2} \tan 2c \right] x$$

in the interval (0, 2). For c = 37, $d_0 = 1.766$ approximately, i.e. nearer to 2 than
to $\sqrt{2}$, the limits given in Problem 5.


7. The function $s \to f(s) = \tan(\tfrac{1}{4}\pi s)$ defines an A-average in $(-2, 2)$. In this case the
A-average of the lengths of the edges of the n-simplex has no proper maximum, but if
the supremum is denoted by $d_n$, show that $d_n = 2$ for all n. This is also the upper
bound for the topological diameter. 


15.5 THE TRANSFINITE A-DIAMETER 


A beautiful application of mean values is furnished by the notion of the transfinite 
diameter of a set. The original concept was introduced in 1923 by Mihaly Fekete 
(born in Hungary 1886, died in Jerusalem 1957) in connection with a question 
involving algebraic numbers. The precise nature of this problem has no bearing on 
the following discussion. 

Given a bounded closed set E in the complex plane, Fekete wanted a measure
of how far apart n points of E could get on the average. He was working with
discriminants, i.e. expressions of the form

$$\prod_{1 \le j < k \le n} (z_j - z_k)$$

so it was natural for him to use the geometric mean for averaging, that is,

$$\prod_{1 \le j < k \le n} |z_j - z_k|^{1/N}, \qquad N = \tfrac{1}{2} n(n - 1). \tag{15.5.1}$$

Denote the maximum of this expression for any choice of n points in E by $d_n(E)$.
Fekete could show that the sequence $\{d_n(E)\}$ is decreasing, so that

$$d_0(E) = \lim_{n \to \infty} d_n(E) \tag{15.5.2}$$

exists. He called this number the transfinite diameter of E. It is obviously at most
equal to the topological diameter d(E) and usually considerably less. Thus for a
circular disk it equals the radius, for a line segment it is a quarter of the length.
Fekete found that this concept gave him the solution of his algebraic problem, but
he soon realized that he had got hold of something much more significant, and he
spent the rest of his active life as a mathematician on investigations of the
transfinite diameter. Thus this concept has a bearing on the so-called Čebyšev constant
(see next section), on the logarithmic equilibrium potential of E and, if E is simply
connected, on the exterior conformal mapping radius of E. Here work by G. Szegő
also played an important role. For the latter concepts, see Section 15.8.

In 1931 Pólya and Szegő considered the corresponding problem in three
dimensions. Since they wanted to preserve the contact with potential theory
(Newtonian in this case), they found that the averaging should be based not on the
geometric but rather on the harmonic mean. The former corresponds to
$f(s) = \log s$, the latter to $s^{-1}$. These are the simplest logarithmic and Newtonian
potential functions respectively, s being the distance from a fixed to a variable point
in the space. Pólya and Szegő also considered $f(s) = s^p$ in two and three
dimensions.

It is clear that a transfinite diameter is definable for an arbitrary metric space 
and arbitrary A-averages. Much work along such lines has been done by F. Leja 
(Krakow). We shall consider this problem. Actually the connection with potential 
theory of a generalized kind is preserved also in this case. 

Let X be a complete metric space, let E be a bounded infinite point set in X, and
let A be an averaging process, satisfying the postulates, defined by a continuous
strictly monotone function f(s).

We take n points $P_1, P_2, \ldots, P_n$ of E, note the distances

$$d(P_j, P_k) = d_{jk}, \qquad j \neq k, \tag{15.5.3}$$

and take the A-average of the $d_{jk}$ for $1 \le j < k \le n$, which is denoted by $A(d_{jk})$ for
short. This is a positive number, not exceeding the topological diameter d(E). Set

$$d_n(E) = \sup A(d_{jk}) \tag{15.5.4}$$

when the n points $P_j$ range over E. Here and in the following we restrict ourselves
to A-averages for which the formulas make sense; a sufficient but not necessary
condition is that $(a, b) = (0, \infty)$.


Lemma 15.5.1. The sequence $\{d_n(E)\}$ is non-increasing. We set

$$\lim_{n \to \infty} d_n(E) = d_0(E; A). \tag{15.5.5}$$

Proof. The properties of the average A will be used. For a given n and a given
$\varepsilon > 0$ we can find n + 1 points $Q_j$ in E with

$$d(Q_j, Q_k) = \delta_{jk}$$

such that

$$A(\delta_{jk}) > d_{n+1}(E) - \varepsilon \tag{15.5.6}$$

by the definition of the supremum. Here there are $\tfrac{1}{2}n(n + 1)$ distances $\delta_{jk}$. We can
take n - 1 sets of these distances and average, obtaining the same result by Lemma
15.1.2. These $\tfrac{1}{2}(n - 1)n(n + 1)$ distances are now grouped into n + 1 sets with
$\tfrac{1}{2}n(n - 1)$ distances in each set. This can be done in such a manner that in the jth
set no distance involving $Q_j$ occurs. Let $\eta_j$ be the average of the distances in the jth
set. Here

$$\eta_j \le d_n(E) \tag{15.5.7}$$

for each j since we are dealing with distances between n points in E. Then by (A4)
and Lemma 15.1.3

$$A(\delta_{jk}) = A(\eta_1, \eta_2, \ldots, \eta_{n+1}) \le d_n(E). \tag{15.5.8}$$

Hence

$$d_{n+1}(E) - \varepsilon < d_n(E) \tag{15.5.9}$$

for all $\varepsilon > 0$. Thus the sequence is non-increasing and the limit (15.5.5) exists. ∎
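The monotone behaviour of the sequence can be observed in Fekete's original setting, the geometric mean on the unit circle. The sketch below evaluates the geometric mean of the mutual distances of the nth roots of unity. That these points actually maximize the average is an assumption here (it is the assumption the hint to Problem 3 of Exercise 15.5 also makes); the closed form $n^{1/(n-1)}$ used for comparison follows from the fact that the discriminant of $z^n - 1$ has absolute value $n^n$. The function names are illustrative.

```python
import cmath
import itertools
import math


def geometric_mean_of_distances(points):
    """Geometric mean of |z_j - z_k| over all pairs j < k."""
    logs = [math.log(abs(a - b)) for a, b in itertools.combinations(points, 2)]
    return math.exp(sum(logs) / len(logs))


def roots_of_unity(n):
    return [cmath.exp(2j * math.pi * k / n) for k in range(n)]


for n in (2, 3, 5, 10, 50, 200):
    observed = geometric_mean_of_distances(roots_of_unity(n))
    closed_form = n ** (1.0 / (n - 1))
    print(n, observed, closed_form)
```

The values decrease with n and approach 1, the radius of the disk, in agreement with the statement that the transfinite diameter of a circular disk equals its radius.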


This set function $d_0(E; A)$ has certain properties of continuity and monotony.
Thus we get


Lemma 15.5.2. If $E_1 \subset E_2$, then

$$d_0(E_1; A) \le d_0(E_2; A). \tag{15.5.10}$$


Proof. Any choice of n points of $E_1$ is also a choice of n points in $E_2$. Thus the
supremum in the second case must be at least as large as in the first case. Since this
holds for all n, it holds also for the limits. ∎


This is as far as we can go. Even if $E_1$ is quite a small subset of $E_2$, the two
transfinite diameters may be equal. Thus in Fekete's case the circular disk and its
perimeter have the same transfinite diameter. In fact, this is quite natural and may
be expected in much more general cases, for the points that maximize the average for
n points will usually find the boundary as the best possible location.

We have also 


Lemma 15.5.3. Let $E_\varepsilon$ be the set of points having a distance from E not
exceeding $\varepsilon$, then

$$\lim_{\varepsilon \downarrow 0} d_0(E_\varepsilon) = d_0(E). \tag{15.5.11}$$


Proof. This is essentially an exercise in carrying out repeated limiting processes in
the appropriate order. We choose an integer n, a positive number $\eta$, and n points
$Q_1, Q_2, \ldots, Q_n$ in $E_\varepsilon$ so that their average distance $A(Q) > d_n(E_\varepsilon) - \eta$. Now for
each $Q_j$ there is a point $P_j$ in E such that

$$d(P_j, Q_j) \le \varepsilon,$$

and we may assume that $P_j \neq P_k$ if $j \neq k$. Then

$$d(Q_j, Q_k) \le d(P_j, P_k) + 2\varepsilon$$

and

$$d_n(E_\varepsilon) - \eta < A[\ldots, d(P_j, P_k) + 2\varepsilon, \ldots]. \tag{15.5.12}$$

Now A is a continuous function in each of its $N = \tfrac{1}{2}n(n - 1)$ arguments at the
point in $R^N$ whose coordinate at the place (j, k) is $d(P_j, P_k)$. Hence we can find a
$\sigma(\varepsilon)$ which goes to zero with $\varepsilon$ such that

$$A[\ldots, d(P_j, P_k) + 2\varepsilon, \ldots] \le A[\ldots, d(P_j, P_k), \ldots] + \sigma(\varepsilon) \le d_n(E) + \sigma(\varepsilon).$$

Hence

$$d_n(E_\varepsilon) < d_n(E) + \sigma(\varepsilon) + \eta. \tag{15.5.13}$$

Here the left member is $\ge d_0(E_\varepsilon)$ so that

$$d_0(E_\varepsilon) < d_n(E) + \sigma(\varepsilon) + \eta.$$

The left member is a non-decreasing function of $\varepsilon$, so it tends to a limit as $\varepsilon \downarrow 0$
and $\sigma(\varepsilon) \to 0$. Hence

$$\lim_{\varepsilon \downarrow 0} d_0(E_\varepsilon) \le d_n(E) + \eta$$

for every n. Now let $n \to \infty$ to obtain

$$\lim_{\varepsilon \downarrow 0} d_0(E_\varepsilon) \le d_0(E) + \eta$$

for every $\eta > 0$ and hence also for $\eta = 0$. On the other hand, the limit in the left
member is at least $d_0(E)$ and (15.5.11) holds. ∎


EXERCISE 15.5 


1. Verify that if E is a line segment and A is the geometric mean, then $d_0(E; A)$ is a
quarter of the length of the segment.


2. What is the answer if A is the arithmetic mean instead?


3. Find $d_0(E; A)$ for the unit disk and the geometric mean. [Hint: Assume that for each
n the nth roots of unity define a maximizing configuration by reasons of symmetry.
The value of the integral $\int_0^{\pi} \log \sin t \, dt = -\pi \log 2$ may be required.]


4. Find $d_0(E; M_2)$ for the unit ball in $R^n$.


15.6 THE ČEBYŠEV CONSTANTS


This is another class of set functions obtained by an averaging process. The original
definition dealt with sets in the complex plane and the use of geometric means.
Given a bounded closed set E in the complex plane, consider the set $\{P_n\} = \mathcal{P}_n$ of
all polynomials of degree n in the complex variable z with leading coefficient 1.
The absolute value of each such polynomial attains a maximum in E. For what
polynomial in the set $\mathcal{P}_n$ is the maximum as small as possible? This is a minimax
problem. There exists a unique polynomial $T_n(z; E)$ for which the minimax is
assumed. This is known as the nth Čebyšev polynomial for the set E after the
Russian mathematician Pafnutiĭ L'vovič Čebyšev (1821-94), who first considered
such questions. The transliteration of Russian names varies from one language to
another. Here we are using the Czech spelling, which is that used by Mathematical
Reviews. The "T" in $T_n(z)$ is a relic of older spellings starting with "Tch" or
"Tsch".
We now set

$$M_n = M_n(E) = \Big[ \max_{z \in E} |T_n(z; E)| \Big]^{1/n}. \tag{15.6.1}$$

It may be shown that the sequence $\{M_n\}$ converges to a limit known as the
Čebyšev constant of E

$$C(E) = \lim_{n \to \infty} M_n(E). \tag{15.6.2}$$

It was shown by Fekete that his transfinite diameter $d_0(E)$ coincides with C(E).

Now the absolute value of a polynomial of degree n and leading coefficient 1 
is simply the product of the distances from a variable point z to n fixed points, 
$z_1, z_2, \ldots, z_n$, the roots of the polynomial $P_n$ under consideration. If we extract the
nth root of the absolute value, we are simply taking the geometric mean of the 
distances. 
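The observation of the preceding paragraph is a one-line identity, and a short numerical check makes it concrete. In the sketch below the roots and the evaluation point are arbitrary sample values, and the function names are illustrative.

```python
import math


def monic_poly_abs(z, roots):
    """|P(z)| for the monic polynomial with the given roots."""
    value = 1.0 + 0.0j
    for r in roots:
        value *= (z - r)
    return abs(value)


def geometric_mean(values):
    return math.exp(sum(math.log(v) for v in values) / len(values))


roots = [0.5 + 0.5j, -1.0 + 0.0j, 0.25j]
z = 2.0 - 1.0j

nth_root_of_abs = monic_poly_abs(z, roots) ** (1.0 / len(roots))
mean_distance = geometric_mean([abs(z - r) for r in roots])
print(nth_root_of_abs, mean_distance)   # the two numbers agree up to rounding
```

This is the reason the minimax problem for polynomials can be restated purely in terms of distances, which is what allows the generalization to other averages and other metric spaces.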

This is a familiar situation which invites generalizations. We take an arbitrary 
complete metric space X and a bounded infinite set E in X. Take an averaging
process A satisfying the postulates and such that the interval of definition (a, b)
contains (0, d(E)] where d(E) is the topological diameter of E. Take now n points
$P_1, P_2, \ldots, P_n$ in X but not necessarily in E and form the A-average of the distances
from a point P to $P_1, P_2, \ldots, P_n$,

$$A[d(P, P_1), d(P, P_2), \ldots, d(P, P_n)] = g(P). \tag{15.6.3}$$

This function g(P) has a supremum for P in E. We now ask: What is the infimum
of the set of suprema for a given integer n and arbitrary choice of the points $P_1$ to
$P_n$? We set

$$\inf_{g} \sup_{P \in E} g(P) = M_n(E). \tag{15.6.4}$$


The sequence $\{M_n(E)\}$ is bounded, for if all points $P_j$ as well as P are in E, then
$d(P, P_j)$ cannot exceed d(E) for any j, and the same bound must then apply to
$M_n(E)$ for all n. Actually the sequence is convergent, but to prove this we need some
further inequalities.


Lemma 15.6.1. We have

$$M_{m+n}(E) \le \max [M_m(E), M_n(E)] \tag{15.6.5}$$

for all m and n. In particular,

$$M_{2n}(E) \le M_n(E), \quad \forall n. \tag{15.6.6}$$

More generally,

$$M_{kn}(E) \le M_n(E), \quad \forall k, n. \tag{15.6.7}$$


Proof. If $\varepsilon > 0$ is given, we can find two functions of P:

$$g_1(P) = A[d(P, P_1), d(P, P_2), \ldots, d(P, P_m)],$$
$$g_2(P) = A[d(P, Q_1), d(P, Q_2), \ldots, d(P, Q_n)],$$

whose suprema on E are within $\varepsilon$ of $M_m$ and $M_n$ respectively. We then form the
function

$$g(P) = A[d(P, P_1), \ldots, d(P, P_m), d(P, Q_1), \ldots, d(P, Q_n)]$$

whose supremum on E is at least $M_{m+n}(E)$. By (A4) we have for all P

$$g(P) = A[g_1(P), \ldots, g_1(P), g_2(P), \ldots, g_2(P)]$$

with m entries $g_1(P)$ and n entries $g_2(P)$. This gives

$$g(P) \le \max [g_1(P), g_2(P)],$$

and on E the right member does not exceed

$$\max [M_m(E) + \varepsilon, M_n(E) + \varepsilon] = \max [M_m(E), M_n(E)] + \varepsilon.$$


This implies (15.6.5) since $\varepsilon > 0$ is arbitrary. The two remaining inequalities are
simply special cases of (15.6.5) combined with (A4). ∎


For a fixed $E \subset X$ the sequence $\{M_n(E)\}$ is a valuation sequence in the sense of
Section 13.4 by virtue of (15.6.5). We refer to Exercise 13.4 for various ways in 
which such a sequence may behave. The main point here is that a valuation sequence 
need not converge. This means that additional information is required to ensure 
convergence. 

Lemma 15.6.2. For any number $b \ge d(E)$

$$M_{n+1}(E) \le A[M_n, M_n, \ldots, M_n, b] \tag{15.6.8}$$

where $M_n$ figures n times.


Proof. We use the function $g_2(P)$ defined above and take an arbitrary point Q in E
and form, with n entries $g_2(P)$,

$$g(P) = A[g_2(P), g_2(P), \ldots, g_2(P), d(P, Q)]. \tag{15.6.9}$$

The supremum of g(P) on E is at least $M_{n+1}(E)$. On the other hand, on E the right
member is at most

$$A[M_n(E) + \varepsilon, M_n(E) + \varepsilon, \ldots, M_n(E) + \varepsilon, d(E)].$$

Since $\varepsilon$ is arbitrary (15.6.8) results. ∎

With the aid of these inequalities we shall now prove the convergence of 
$\{M_n(E)\}$.

Theorem 15.6.1. The sequence $\{M_n(E)\}$ converges to a limit

$$C(E; A) = \inf_{n} M_n(E), \tag{15.6.10}$$

called the Čebyšev constant of the set E with respect to the average A.
Proof. Problems 7, 8, and 9 under Exercise 13.4 suggest the following
arrangement of the proof. Set

$$\beta = \limsup_{n \to \infty} M_n(E). \tag{15.6.11}$$

We shall show that $\beta \le M_n$ for all n, so that a unique limit must exist and equal $\beta$.
Let us first assume that there is a number $\gamma$ and an integer j such that

$$M_j(E) < \gamma, \qquad M_{j+1}(E) < \gamma. \tag{15.6.12}$$

By Problem 8, Exercise 13.4, every integer $n > j^2$ admits of a representation of the
form

$$n = pj + q(j + 1), \tag{15.6.13}$$

where p and q are non-negative integers. This combined with Lemma 15.6.1 shows
that for such values of n

$$M_n(E) \le \max [M_j(E), M_{j+1}(E)] < \gamma. \tag{15.6.14}$$

That is, if j and $\gamma$ exist for which (15.6.12) holds, then necessarily $\beta \le \gamma$. The proof
proceeds by showing that if

$$M_n < \beta$$

for some n, then j and $\gamma$ exist with $\beta > \gamma$ and this contradiction establishes that
$\beta \le M_n(E)$ for all n. Here is where the second lemma comes in handy.
Suppose that for some m we have $M_m(E) = \alpha < \beta$. Then for all n of the form

$$n = 2^k m$$

we would also have $M_n \le \alpha$ from (15.6.7). For such an n we use (15.6.8) to obtain

$$M_{n+1}(E) \le A[M_n(E), M_n(E), \ldots, M_n(E), d(E)] \le A[\alpha, \alpha, \ldots, \alpha, d(E)].$$

As n goes to infinity with k, the last member converges to $\alpha$ by Lemma 15.3.1. It
follows that (15.6.12) holds for some large value of j, say $j = 2^k m$, if we take for $\gamma$
some number greater than $\alpha$. Since $\alpha < \beta$, we can take $\gamma < \beta$ and this gives the
contradiction. Hence we must have

$$M_m(E) \ge \beta = \limsup_{n \to \infty} M_n(E) \tag{15.6.15}$$

for all m, and this proves the existence of $\lim M_n(E)$. ∎
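Two ingredients of this proof lend themselves to a quick numerical illustration: the representation $n = pj + q(j + 1)$ for $n > j^2$, and the fact that an average of n copies of $\alpha$ and one copy of a fixed number tends to $\alpha$ as n grows. The sketch below checks the first by direct search and shows the second for power means only; the text appeals to Lemma 15.3.1 for general averages, and that lemma is not reproduced here. The function names and the sample values 0.5 and 2.0 are illustrative assumptions.

```python
def represent(n, j):
    """Return non-negative integers (p, q) with n == p*j + q*(j + 1), or None."""
    for q in range(n // (j + 1) + 1):
        remainder = n - q * (j + 1)
        if remainder % j == 0:
            return remainder // j, q
    return None


def padded_power_mean(alpha, d, n, p=1.0):
    """Power mean M_p (p > 0) of n copies of alpha and one copy of d."""
    return ((n * alpha ** p + d ** p) / (n + 1)) ** (1.0 / p)


j = 5
print([represent(n, j) for n in range(j * j + 1, j * j + 6)])

# One outlying entry d = 2.0 among n = 2^k copies of alpha = 0.5.
for k in (2, 6, 10, 14):
    print(2 ** k, padded_power_mean(0.5, 2.0, 2 ** k, p=3.0))
```

The second loop shows the influence of the single large entry fading as the number of copies of $\alpha$ doubles repeatedly, which is the step that produces two consecutive indices with small values.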


Here we cannot claim equality between the transfinite diameter $d_0(E; A)$ and
the Čebyšev constant C(E; A) in general, even though equality holds for the
geometric mean and a complex plane set. Already Pólya and Szegő found
counterexamples. On the other hand, we do have an inequality.


Theorem 15.6.2. For every admissible choice of the average A

$$C(E; A) \le d_0(E; A). \tag{15.6.16}$$


Proof. In E we choose n + 1 points $P_1, P_2, \ldots, P_n, P_{n+1}$ in such a manner that the
average distance between the first n points exceeds $d_n(E) - \varepsilon$ for a given $\varepsilon > 0$.
Thus

$$A[d(P_j, P_k);\ 1 \le j < k \le n] > d_n(E) - \varepsilon.$$

On the other hand, if we average over all n + 1 points, the result is at most $d_{n+1}(E)$,
so that

$$d_{n+1}(E) \ge A[d(P_j, P_k)] = A[d(P_{n+1}, P_1), d(P_{n+1}, P_2), \ldots, d(P_{n+1}, P_n) \mid d(P_1, P_2), d(P_1, P_3), \ldots, d(P_{n-1}, P_n)].$$

Here $P_{n+1}$ is at our disposal and may be chosen so that the average of the n
distances from $P_{n+1}$ is arbitrarily close to the supremum of

$$g(P) = A[d(P, P_1), d(P, P_2), \ldots, d(P, P_n)], \qquad P \in E,$$

which in its turn is at least equal to $M_n(E)$. The average of the remaining $\tfrac{1}{2}n(n - 1)$
distances is at least equal to $d_n(E) - \varepsilon$ by the choice of the first n points. Hence

$$d_{n+1}(E) \ge A[M_n(E) - \eta, \ldots, M_n(E) - \eta, d_n(E) - \varepsilon, \ldots, d_n(E) - \varepsilon].$$

Here $\eta$ is small positive, there are n entries of the first kind and $\tfrac{1}{2}n(n - 1)$ of the
second. Since $\varepsilon$ and $\eta$ are arbitrarily small, we can let these numbers tend to zero
and appeal to the continuity of A to obtain

$$d_{n+1}(E) \ge A[M_n(E), \ldots, M_n(E), d_n(E), \ldots, d_n(E)].$$

Here we note that $d_n(E) \ge d_{n+1}(E)$ and that A is strictly increasing. Hence

$$d_{n+1}(E) \ge A[M_n(E), \ldots, M_n(E), d_{n+1}(E), \ldots, d_{n+1}(E)]. \tag{15.6.17}$$

At this point we fall back on the generating function of A and, to fix the ideas, we
assume that f is strictly increasing. This gives

$$f[d_{n+1}(E)] \ge f[A(\ldots)] = \frac{2}{n(n + 1)} \left\{ n f[M_n(E)] + \tfrac{1}{2} n(n - 1) f[d_{n+1}(E)] \right\}.$$

This simplifies to

$$f[d_{n+1}(E)] \ge f[M_n(E)] \tag{15.6.18}$$

and

$$d_{n+1}(E) \ge M_n(E) \ge C(E; A) \tag{15.6.19}$$

by (15.6.10). Passing to the limit with n we get (15.6.16). ∎
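The simplification at the end of this proof is a short piece of algebra. Written out, with the shorthand $M = M_n(E)$ and $d = d_{n+1}(E)$ (introduced here for brevity and not used in the text), and for f strictly increasing, it runs as follows.

```latex
% Shorthand (not in the original): M = M_n(E), d = d_{n+1}(E).
\begin{align*}
f(d) &\ge \frac{2}{n(n+1)} \Bigl\{ n f(M) + \tfrac{1}{2} n(n-1) f(d) \Bigr\} \\
\tfrac{1}{2} n(n+1) f(d) - \tfrac{1}{2} n(n-1) f(d) &\ge n f(M) \\
n f(d) &\ge n f(M) \\
f(d) &\ge f(M) \quad\Longrightarrow\quad d \ge M
  \quad \text{(since $f$ is strictly increasing).}
\end{align*}
```

If f is strictly decreasing instead, the inequalities between the values of f are reversed at each step, and the conclusion $d \ge M$ is the same.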


EXERCISE 15.6 


In the first nine problems X is the complex plane and A is the geometric mean.


1. Show that a polynomial $P_n(z) \in \mathcal{P}_n$ which has zeros outside the closed convex hull H
of E cannot be a minimizing polynomial. The closed convex hull is the least closed
convex set containing E.


2. Prove the existence of $T_n(z; E)$. [Hint: The maxima of $|P_n(z)|$ in E form a bounded
set, so the infimum exists and is a positive number (why positive?). There exists a
sequence of elements in $\mathcal{P}_n$ for which the maxima converge to the infimum. Select
subsequences of polynomials whose zeros converge to limits in H.]
3. Show that the minimizing polynomial $T_n(z; E)$ is unique if the equation
$\max |T_n(z; E)| = [M_n(E)]^n$ has n + 1 distinct roots in E. [Hint: For a competing
polynomial, $U_n(z)$, consider the set $cT_n(z) + (1 - c)U_n(z)$, 0 < c < 1, and discuss
maxima.]


4. The original Čebyšev polynomial (of the first kind) of degree n is most conveniently
defined by

$$T_n(z) = 2^{-n} \left\{ [z + (z^2 - 1)^{1/2}]^n + [z - (z^2 - 1)^{1/2}]^n \right\} = 2^{1-n} \cos(n \arccos z).$$

Show that these two expressions are equal and are actually polynomials in z of degree
n and leading coefficient 1. Determine zeros and maxima and minima.


5. Use the criterion of No. 3 or otherwise to show that $T_n(z)$ is the unique Čebyšev
polynomial of degree n for the interval [-1, 1].


6. If $S_n(z) = 2^{n-1} T_n(z)$, verify the composition property

$$(S_j \circ S_k)(z) = S_j[S_k(z)] = S_{jk}(z).$$

7. Solve the Schröder equation (see Problem 19, Exercise 14.2)

$$f[S_n(z)] = \lambda f(z).$$

8. Find the Čebyšev constant for the interval [-1, 1].
9. Show that the Čebyšev constant for the unit disk equals 1.


In the remaining problems X and A are arbitrary unless otherwise restricted.


10. If $E_1 \subset E_2$, show that $C(E_1; A) \le C(E_2; A)$.
11. Compare the Čebyšev constants for the power means
$C(E; M_\alpha)$ and $C(E; M_\beta)$, $\alpha < \beta$.
12. Show that Problem No. 1 generalizes to any B-space X and any arbitrary A. Only
those functions g(P) of formula (15.6.3) need be considered for which the points $P_j$
lie in the convex hull of E.


15.7 SOME EXAMPLES 


It is of some interest to examine the unit ball U of a Banach space X from the point 
of view of transfinite diameters. It is clear that no matter what average we use (with 
domain containing [0, 2]) the transfinite diameter $d_0(U; A)$ cannot exceed 2, which
is the topological diameter. But in a surprisingly large number of cases the value 2
is actually attained.


Theorem 15.7.1. $d_0(U; A) = 2$ for all admissible A if X is one of the spaces
$C[a, b]$, $L_1(a, b)$, $L_\infty(a, b)$, $l_1$, $m$.


Proof. The idea of the proof is the same in all these cases. For any natural
number n we may exhibit n elements which are unit vectors two units apart. We
consider the two cases C[a, b] and $l_1$, leaving the others to the reader.
In the case of C[a, b], divide the interval into n equal parts. On each subinterval
we erect an isosceles triangle of height 1. For the function $f_j$ the jth triangle points
upwards, all other triangles pointing downwards. Here $\|f_j\| = 1$ and $\|f_j - f_k\| = 2$
for $j \neq k$. The average distance between these n elements $f_j$ is consequently 2, so
$d_n(U) = 2$ for each n and this gives $d_0(U; A) = 2$.

A similar construction works in $l_1$. Take the sequence of unit vectors $\{x_k\}$
where $x_k$ has a one in the kth place and zeros elsewhere. Then $\|x_k\| = 1$ and
$\|x_j - x_k\| = 2$ for $j \neq k$. ∎
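The construction in $l_1$ can be checked on a finite truncation. The sketch below forms the first few coordinate unit vectors and lists their mutual distances in the $l_1$ norm; for contrast it also lists the distances in the Euclidean norm, where the same vectors are only $\sqrt{2}$ apart. Truncating to finitely many coordinates, and the function names, are illustrative simplifications.

```python
import itertools
import math


def coordinate_vectors(n):
    """The first n unit vectors x_k (1 in the k-th place, 0 elsewhere),
    truncated to n coordinates."""
    return [[1.0 if i == k else 0.0 for i in range(n)] for k in range(n)]


def l1_norm(v):
    return sum(abs(c) for c in v)


def l2_norm(v):
    return math.sqrt(sum(c * c for c in v))


def pairwise_distances(vectors, norm):
    return [norm([a - b for a, b in zip(u, v)])
            for u, v in itertools.combinations(vectors, 2)]


vectors = coordinate_vectors(6)
print(set(pairwise_distances(vectors, l1_norm)))                         # l_1 distances
print(set(round(d, 12) for d in pairwise_distances(vectors, l2_norm)))   # Euclidean distances
```

Since every entry of the averaged list equals 2 in the $l_1$ case, any average that reproduces a constant list returns 2, independently of the choice of A.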


The Čebyšev constants of the unit ball are subject to the following restriction.


Theorem 15.7.2. The Čebyšev constant of the unit ball in a B-space cannot
exceed 1.


Proof. We have to consider

$$\inf_{\{x_j\}} \sup_{\|x\| \le 1} A[\|x - x_j\|;\ j = 1, 2, \ldots, n] \tag{15.7.1}$$

where the $x_j$'s are n arbitrary elements of X. If we take all $x_j = 0$, the supremum is
the sup of $\|x\|$, i.e. 1, so the inf sup can be at most 1. ∎


These two theorems show that there may very well be a wide gap between the 
values of the transfinite diameter and the corresponding Čebyšev constant. Thus in
general the inequality (15.6.16) cannot be replaced by equality. The actual
determination of Čebyšev constants for unit balls of Banach spaces of infinite dimension
does not seem to have been attacked.

Let us return to the transfinite diameter. A pertinent question is the following: 
Is there a choice of A most natural to a given metric space X? Is there any sense in 
saying that such and such a definition of transfinite diameter is the most natural? 
What criteria are applicable to settle such a question? Should we give preference to 
a definition for which transfinite diameter and Čebyšev constant coincide? Then
none of the spaces mentioned in Theorem 15.7.1 would admit of a natural 
definition. A criterion which is sometimes used is to require that the unit ball 
should have transfinite diameter unity. Again such a criterion cannot be applied 
to the cases mentioned above. Still another criterion is to require that the function 
f defining the average be an elementary potential function in the space. This gives 
the geometric mean with f(s) = log s in $R^2$, the harmonic mean with f(s) = 1/s in
$R^3$, the power mean $M_{2-n}$ with $f(s) = s^{2-n}$ in $R^n$. This would make the arithmetic
mean with f(s) = s the natural choice in $R^1$ since linear functions are "harmonic"
in this space. Beyond these cases the connections with potential theory become 
rather tenuous. As we shall see in the next section, however, it is possible to build 
a potential theory around almost any function f which defines a mean value. The 
existence of infinitely many parallel potential theories valid in the same space is a 
fact. The choice is merely transferred from transfinite diameters to potential 
theories. 

The question of finding transfinite diameters of the unit ball in $L_2(a, b)$ or,
equivalently, in a Hilbert space leads to problems which are solvable unless the
chosen averaging process creates trouble. Actually the solution in a large number
of cases is given by Theorem 15.4.1.


Theorem 15.7.3. Let U be the unit ball in $L_2(-\pi, \pi)$ or any isometric and
isomorphic image thereof. Then $d_0(U; A) \ge \sqrt{2}$ for all A. We have $d_0(U; A) = \sqrt{2}$
if the generating function f of A satisfies one of the conditions (1) to (4) of
Theorem 15.4.1. If $f(s) = s^p$, $2 < p$, then $d_0(U; A) = 2^{1-1/p}$.


Proof. The set of functions

$$\{(2\pi)^{-1/2} e^{ikt};\ k = 0, \pm 1, \pm 2, \ldots\} \tag{15.7.2}$$

is an orthonormal system for the space $L_2(-\pi, \pi)$ and the distance between
distinct elements of the set is constantly equal to $\sqrt{2}$. Hence for every n we can find
n elements of the space of norm 1 whose distance from each other equals $\sqrt{2}$. It
follows that $d_n(U; A) \ge \sqrt{2}$ for every choice of A and all n and this gives
$d_0(U; A) \ge \sqrt{2}$ as asserted.

To prove the remaining assertions we have to analyze the results obtained in
Theorem 15.4.1. This theorem states that under the listed conditions the regular
n-simplex maximizes the average of the lengths of the edges of any n-simplex
inscribed in the unit sphere. The space is supposed to be an inner product space of
sufficiently high dimension, at least n − 1. There is no need, however, to assume
that the dimension is exactly n − 1. This assumption, even when explicitly made,
is used nowhere in the proof. Thus there is nothing to prevent us from taking the
sphere to be the unit ball of $L_2(-\pi, \pi)$. It then follows that

$$d_n(U; A) = \left(\frac{2n}{n-1}\right)^{1/2} \tag{15.7.3}$$

and thus $d_0(U; A) = \sqrt{2}$.

Finally, if $A = M_p$ with $2 < p$, we conclude from (15.4.8) that

$$d_{2k}(U; A) = 2\left[\frac{k}{2k-1}\right]^{1/p} \tag{15.7.4}$$

and this gives $d_0(U; A) = 2^{1-1/p}$ as asserted. ∎
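The two values used in the proof can be checked numerically in a finite-dimensional Euclidean space. The following is only a sketch under stated assumptions: the helper names are hypothetical, the regular simplex is built by centring the standard basis vectors, and for $p > 2$ the configuration "k copies of a unit vector e and k copies of −e" is assumed as one arrangement of 2k points that attains the value $2[k/(2k-1)]^{1/p}$.

```python
import numpy as np

def pairwise_distances(points):
    """Distances between all pairs j < k of the rows of `points`."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    rows, cols = np.triu_indices(len(points), k=1)
    return dist[rows, cols]

def regular_simplex(n):
    """n unit vectors with equal mutual distances (they span n - 1 dimensions)."""
    centred = np.eye(n) - 1.0 / n      # subtract the centroid of the basis vectors
    return centred / np.linalg.norm(centred, axis=1, keepdims=True)

def power_mean(values, p):
    """The power mean M_p generated by f(s) = s**p."""
    return float(np.mean(values ** p) ** (1.0 / p))

n = 6
simplex_edges = pairwise_distances(regular_simplex(n))

# Assumed configuration for p > 2: k copies of e and k copies of -e.
k, p = 5, 4.0
e = np.zeros(3)
e[0] = 1.0
antipodal = np.vstack([np.tile(e, (k, 1)), np.tile(-e, (k, 1))])
antipodal_mean = power_mean(pairwise_distances(antipodal), p)
```

Every edge of the simplex has length $(2n/(n-1))^{1/2}$, which tends to $\sqrt{2}$, while the power mean of the antipodal configuration tends to $2^{1-1/p}$ as k grows.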


For the spaces $l_p$ and $L_p$ definitive results are lacking.


Theorem 15.7.4. If U is the unit ball in $l_p$ or in $L_p(a, b)$ with $1 < p$, then

$$d_0(U; A) \ge 2^{1/p}. \tag{15.7.5}$$

Proof. Consider $l_p$ and the unit vectors $(\delta_{jm})$. Here the distance between $(\delta_{jm})$ and
$(\delta_{km})$ is $2^{1/p}$ for all $k \neq j$. This gives $d_n(U; A) \ge 2^{1/p}$ for all A and all n. This implies
(15.7.5). The case $L_p$ is left to the reader. ∎
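The distance computation in this proof is easy to confirm on a finite section of $l_p$. The sketch below uses hypothetical helper names and truncates the sequences to n coordinates, which does not change the distances between the first n unit vectors.

```python
import numpy as np

def lp_norm(x, p):
    """The l_p norm of a finite sequence x."""
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

def unit_vector_distances(n, p):
    """l_p distances between the first n unit vectors, taken pairwise."""
    e = np.eye(n)
    return [lp_norm(e[j] - e[k], p) for j in range(n) for k in range(j + 1, n)]

distances = unit_vector_distances(5, 3.0)
```

Since every entry of `distances` equals $2^{1/p}$, any A-average of these numbers is also $2^{1/p}$, whatever the generating function.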


The inequality (15.7.5) is possibly the best of its kind for $1 < p \le 2$. For $2 < p$
better inequalities are known; both the derivation and the result are complicated
and not very likely to be definitive. At any rate, it appears that for $2 < p$ the
transfinite diameter of U is an increasing function of p which converges to 2 as
$p \to \infty$.


EXERCISE 15.7 


1. Prove Theorem 15.7.1 for ¥ = σι. 
2. Same question for X = L , (a,b) and L , (a, db). 
3. Prove that Theorem 15.7.1 holds also for the space of functions f holomorphic and
bounded in the unit disk of the complex plane. The metric is defined by the sup norm.
[Hint: The integral powers of z may be worthy of consideration.]


4. Let $P_0$ be a fixed point in $R^n$, n > 2, P a variable point and r the Euclidean distance
from $P_0$ to P. Consider the generalized Laplacian operator

$$\sum_{k=1}^{n} \frac{\partial^2}{\partial x_k^2},$$

and show that $r^{2-n}$ is annihilated by this operator and thus is a harmonic function
in $R^n$.

5. Prove Theorem 15.7.4 for $L_p(a, b)$.

6. Let $\omega_n = \exp(2\pi i/n)$. Then $d_{jk} = |\omega_n^j - \omega_n^k| = 2 \sin[\pi(k - j)/n]$, $1 \le j < k \le n$.
Define $d_n(Q)$ as the A-average of the numbers $d_{jk}$, $j < k$. Further, prove that
$d_0(Q) = \lim d_n(Q)$ exists and, if A is generated by f(s), then

$$f(d_0) = 2 \int_0^1 (1 - s) f[2 \sin \pi s]\, ds.$$

In particular, for the arithmetic means, $d_0(Q) = 4/\pi$. [Hint: Use (15.2.3), the
definition of the Riemann integral and the properties of f.]


7. If X is a B-space over the complex field and if $x \in U$, then all the points $\omega_n^k x$ also
belong to U. Use this to show that if A is taken as the arithmetic mean, then

$$d_0(U; A) \ge \frac{4}{\pi}.$$
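The limit $4/\pi$ appearing in Problems 6 and 7 can be made plausible numerically before it is proved. The sketch below is only a sanity check, not a solution; the function name is hypothetical and only the arithmetic mean is treated.

```python
import math

def arithmetic_average_roots_of_unity(n):
    """Arithmetic mean of |w**j - w**k| for 1 <= j < k <= n, w = exp(2*pi*i/n)."""
    total = 0.0
    for j in range(1, n + 1):
        for k in range(j + 1, n + 1):
            total += 2.0 * math.sin(math.pi * (k - j) / n)
    return total / (n * (n - 1) / 2)

value = arithmetic_average_roots_of_unity(500)
```

For growing n the computed averages decrease towards $4/\pi \approx 1.2732$.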


15.8 POTENTIAL THEORIES 


In this last section of the chapter and of the treatise we shall give a sketch of the 
type of potential theory that can be developed starting from the generating function 
of an A-average. The discussion will be informal and expository, terms will be left 
somewhat vague and proofs will be mostly lacking. It is desired to whet the reader’s 
appetite, not to satiate it.

Classical potential theories are of two kinds: the logarithmic and the
Newtonian. The first theory is built around the basic harmonic function and the
two-dimensional Laplacian

$$\log r \quad\text{and}\quad \frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2} = 0 \tag{15.8.1}$$

of which it is a solution; in the second,

$$\frac{1}{r} \quad\text{and}\quad \frac{\partial^2 U}{\partial x^2} + \frac{\partial^2 U}{\partial y^2} + \frac{\partial^2 U}{\partial z^2} = 0 \tag{15.8.2}$$

play a similar role. Here r is the distance between a fixed point and a variable point
in the space. In fourth and higher dimensions we generalize via Laplace’s equation
and the corresponding basic solution

$$r^{2-n} \quad\text{and}\quad \sum_{j=1}^{n} \frac{\partial^2 U}{\partial x_j^2} = 0. \tag{15.8.3}$$

Further generalization may be given along two different lines. We may 
generalize the Laplacian or the basic function. The first road leads to a study of 
solutions of linear partial differential equations of the elliptic type. We shall avoid 
this path. Secondly, we note that the basic functions in the cases listed are gener- 
ating functions of A-averages: f defined for s > 0, continuous and strictly mono- 
tone. Given such a function, we can carry over the classical formulas of potential 
theory and construct something that may be considered a potential theory. 

The use of such more general kernels goes back to the early 1930’s with Pólya
and Szegő, Marcel Riesz (1886-1969) and Otto Frostman as pathfinders. The
general notion of capacity (more general than below) has been developed by
G. Choquet and Lennart Carleson.

In order to get our bearings, consider the Newtonian case. This begins with
the notion of Borel sets (there are other alternatives, however, but we should not
get lost in subtleties). In any topological space X, the σ-algebra generated by the
open sets is called the class of Borel sets (Émile Borel, 1871-1956), here denoted by
Σ. Cf. Problem 2, Exercise 4.1. Thus Σ contains all open sets, all closed sets,
countable unions, and finite intersections. On Σ shall be defined a class of finite,
non-negative, countably additive set functions μ(S). Here $S \in \Sigma$ and μ(S) is the
measure of S in the measure space (X, Σ, μ). In applications to potential theory “mass”
or “charge” is often used instead of “measure”. “Countably additive” means that

$$\mu\Bigl(\bigcup_{k=1}^{\infty} S_k\Bigr) = \sum_{k=1}^{\infty} \mu(S_k), \quad\text{if } S_j \cap S_k = \emptyset \text{ for } j \neq k, \tag{15.8.4}$$

the series being convergent. See Definition 4.1.2.
If f(x) is real-valued and Σ-measurable and $E \in \Sigma$, one can define the so-called
Radon-Stieltjes integral of f over E with respect to μ (cf. p. 237 et seq.)

$$\int_E f(x)\, d\mu(x) \tag{15.8.5}$$


by the usual discussion of approximating sums of the Riemann-Stieltjes type. The 
integral is named after the Austrian mathematician Johann Radon (1887-1956). 
We mention without further explications the possibility of forming product measures 
and establishing the extension of the Fubini theorem for double integrals. 

If $X = R^3$, the Newtonian potential of μ on E is by definition

$$U(x) = \int_E \|x - y\|^{-1}\, d\mu(y). \tag{15.8.6}$$

Here x and y are vectors in $R^3$ and y is restricted to E. Further, $E \in \Sigma$ and μ(E) = 1.
The corresponding energy integral is

$$I(\mu; E) = \int_E \int_E \|x - y\|^{-1}\, d\mu(x)\, d\mu(y). \tag{15.8.7}$$

This integral always exists but may be +∞. In order to form these integrals, it is
not necessary to assume that $X = R^3$, but they are physically meaningful iff
$X = R^3$.

With Σ as above and $X = R^3$ consider now the set Γ(E) of all countably
additive measures, finite and non-negative, defined on Σ and with μ(E) = 1. To
each such measure μ corresponds an energy integral I(μ; E). Here it is possible that
all energy integrals equal +∞. Then E is said to be of capacity 0, Newtonian
capacity to be more precise.


If E is not of capacity zero, then we set

$$\inf I(\mu; E) = V(E), \tag{15.8.8}$$

$$C(E) = [V(E)]^{-1}. \tag{15.8.9}$$

Here C(E) is by definition the Newtonian capacity of E. If the boundary ∂E of E
is sufficiently smooth, there is a measure $\mu_0 \in \Gamma(E)$ with

$$I(\mu_0; E) = V(E). \tag{15.8.10}$$

If $\mu_0$ is unique, it is known as the equilibrium distribution and the corresponding
potential function

$$\int_E \|x - y\|^{-1}\, d\mu_0(y) \tag{15.8.11}$$

is known as the equilibrium potential.

These terms hail from electrostatics. If a conductor E is given and furnished 
with a unit charge, then the charge will distribute itself on the surface of the 
conductor. The distribution and the resulting potential are then known as the 
equilibrium distribution and the equilibrium potential. V(E) is known as Robin’s 
constant [Gustave Robin (1855-97)].

These are the notions which should be generalized. The kernel $r^{-1}$ is replaced
by an arbitrary kernel K(r) defined for r > 0, continuous and strictly monotone.
Thus K is the generating function of an A-average. K(0) may be finite or infinite.
It is convenient to assume K(r) > 0. True, this excludes the logarithmic potential,
but for the following development it is preferable to deal with positive energies and
positive potentials. We consider the case $X = R^m$ and the set Σ of all Borel sets
there. Again let Γ(E) be the set of all countably additive non-negative measures μ
defined on a closed bounded set $E \in \Sigma$ and its subsets in Σ with μ(E) = 1. Define
the K-potential by

$$U(x; \mu, K) = \int_E K(\|x - y\|)\, d\mu(y) \tag{15.8.12}$$

and the corresponding energy integral

$$I(\mu; E, K) = \int_E \int_E K(\|x - y\|)\, d\mu(x)\, d\mu(y). \tag{15.8.13}$$


It may happen that the energy integral is infinite for all $\mu \in \Gamma(E)$. In this case E
is said to be of K-capacity 0. This presupposes that K(0) = +∞. In any case,
consider

$$V_K(E) = \inf I(\mu; E, K) \tag{15.8.14}$$

and determine the K-capacity $C_K(E)$ from the equation

$$K[C_K(E)] = V_K(E). \tag{15.8.15}$$

The solution is unique since K is strictly monotone and continuous.
Suppose now that $C_K(E) > 0$. We can then find a sequence of mass
distributions $\mu_n \in \Gamma(E)$ such that

(1) $I(\mu_n; E, K) \to V_K(E)$.
(2) There is a $\mu_0 \in \Gamma(E)$ such that

$$\int_E f(x)\, d\mu_n \to \int_E f(x)\, d\mu_0 \tag{15.8.16}$$

for every f continuous on E.
(3) $I(\mu_0; E, K) = V_K(E)$.

If (15.8.16) holds, we say that $\mu_n$ converges weakly to $\mu_0$. If now $\mu_0$ is unique,
it can be obtained by the following construction. We recall that K is the generating
function of an A-average, say $A_K$. Since E is compact in $R^m$, we can find, for any n,
a set of n points $x_1, x_2, \ldots, x_n$ in E such that

$$A_K(\|x_j - x_k\|;\ 1 \le j < k \le n) = d_n(E), \tag{15.8.17}$$

so that

$$\binom{n}{2} K(d_n(E)) = \sum_{1 \le j < k \le n} K(\|x_j - x_k\|). \tag{15.8.18}$$

At each point $x_j$ we place a mass of 1/n. This defines $\mu_n \in \Gamma(E)$ and a simple
calculation shows that

$$I(\mu_n; E, K) = K[d_n(E)] \to K[d_0(E)], \quad n \to \infty, \tag{15.8.19}$$

whence

$$d_0(E; A_K) = C_K(E). \tag{15.8.20}$$
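The bookkeeping behind (15.8.18) and (15.8.19) can be illustrated for an arbitrary finite point set. The sketch below makes several assumptions that are not in the text: the function names are hypothetical, the points are simply equally spaced on the unit circle rather than extremal, the kernel is K(r) = 1/r, and the diagonal terms j = k are left out of the double sum (they would be infinite when K(0) = +∞). With that convention the energy equals $(1 - 1/n)\,K$ evaluated at the $A_K$-average, and the factor tends to 1.

```python
import math

def pairwise_distances(points):
    """Euclidean distances between all pairs j < k of the given points."""
    return [math.dist(points[j], points[k])
            for j in range(len(points)) for k in range(j + 1, len(points))]

def k_average(points, K, K_inverse):
    """A_K-average of the pairwise distances: K_inverse of the mean of K(distance)."""
    d = pairwise_distances(points)
    return K_inverse(sum(K(r) for r in d) / len(d))

def discrete_energy(points, K):
    """Double sum of K over point masses 1/n, with the terms j = k left out."""
    n = len(points)
    return 2.0 * sum(K(r) for r in pairwise_distances(points)) / n ** 2

def newton(r):
    """K(r) = 1/r, which happens to be its own inverse function."""
    return 1.0 / r

n = 12
circle = [(math.cos(2 * math.pi * j / n), math.sin(2 * math.pi * j / n))
          for j in range(n)]
energy = discrete_energy(circle, newton)
average = k_average(circle, newton, newton)
```

Replacing `circle` by points that maximize the $A_K$-average over E would turn `average` into $d_n(E)$ itself.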


This relation reveals another application of transfinite diameters. For
K(r) = log r (which does not satisfy our assumptions!) this connection was
established by Fekete and for K(r) = 1/r by Pólya and Szegő.

In this discussion we have assumed that $C_K(E) > 0$. If this is not true, then the
sequence of potential functions corresponding to $\mu_n$ need not converge weakly to a
limit.

Among the properties of the equilibrium potential, the following should be 
noted: 


(1) $U(x; \mu_0, K) \ge V_K(E)$ for all $x \in E$, excepting at most a set of K-capacity 0.

(2) $U(x) \le V_K(E)$ everywhere on the support of $\mu_0$.

(3) There is a constant B depending only upon the dimension m of the space
such that $U(x) \le B V_K(E)$ for all x.


The notion of a Green’s function may also admit of an extension to
K-potentials. Green’s function will then be of the form

$$G(x, z; E) = K(\|x - z\|) - \int_E K(\|x - y\|)\, d_y \nu(y, z). \tag{15.8.21}$$

It is a symmetric function of x and z and leads to the representation

$$u(z) = \int_E u(y)\, d_y \nu(y, z) \tag{15.8.22}$$

for any K-potential corresponding to a distribution of mass on E.

As the last topic to be mentioned briefly here we consider the relation between 
the transfinite diameter and conformal mapping discovered by Fekete and 
mentioned in Section 15.5.

Let E be a bounded continuum in the complex plane whose complement is
connected. E has a transfinite diameter with respect to the geometric mean and for
each n there is a set of n points $\{z_{jn};\ j = 1, 2, \ldots, n\}$ in E such that

$$\prod_{1 \le j < k \le n} |z_{jn} - z_{kn}| = [d_n(E)]^N, \quad N = \tfrac{1}{2} n(n - 1). \tag{15.8.23}$$

These points are in general not uniquely determined but this does not affect what
follows. For each n there exists a set $\{z_{jn}\}$ with the aid of which we can form an nth
Fekete polynomial for E,

$$F_n(z; E) \equiv F_n(z) = \prod_{j=1}^{n} (z - z_{jn}). \tag{15.8.24}$$

Set

$$f_n(z) = [F_n(z)]^{1/n}, \tag{15.8.25}$$

where the nth root is so chosen that

$$\lim_{z \to \infty} z^{-1} f_n(z) = 1.$$
The functions $f_n$ are holomorphic in the complement of E. A basic property of the
Fekete polynomials (here stated without proof) is that

$$\lim_{n \to \infty} \max_{z \in E} |f_n(z)| = d_0(E), \tag{15.8.26}$$

the geometric transfinite diameter of E. This property ensures the existence of a
subsequence of $\{f_n\}$ which converges everywhere on the complement of E to a limit
function f, uniformly on compact sets. As a matter of fact, the whole sequence
converges to f. This function f is holomorphic on the complement and it is univalent,
i.e. $z_1 \neq z_2$ implies $f(z_1) \neq f(z_2)$. Furthermore, it maps the complement of E
conformally on the exterior of the circle

$$|w| = d_0(E). \tag{15.8.27}$$


Since f has for large values of |z| an expansion of the form

$$f(z) = z + \sum_{k=0}^{\infty} A_k z^{-k} \tag{15.8.28}$$

it follows that f is the function which maps the complement of E on the exterior of
the circle with center at the origin and radius $r_e$, the exterior conformal mapping
radius of E. Thus

$$d_0(E) = r_e(E), \tag{15.8.29}$$

which is the identity proved by Fekete.
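Fekete's identity can be watched at work in the simplest case, the closed unit disk, where the exterior mapping function is f(z) = z and the mapping radius is 1. The sketch below relies on a standard fact not stated in the text, namely that the nth roots of unity form a set of Fekete points for the disk; the function names are hypothetical.

```python
import cmath
import math

def roots_of_unity(n):
    """The n-th roots of unity, assumed here as Fekete points of the unit disk."""
    return [cmath.exp(2j * math.pi * m / n) for m in range(n)]

def geometric_dn(points):
    """[product over j < k of |z_j - z_k|] ** (1/N), N = n(n - 1)/2, via logarithms."""
    n = len(points)
    log_sum = sum(math.log(abs(points[j] - points[k]))
                  for j in range(n) for k in range(j + 1, n))
    return math.exp(log_sum / (n * (n - 1) / 2))

values = {n: geometric_dn(roots_of_unity(n)) for n in (4, 16, 64, 256)}
```

The product of the mutual distances of the roots of unity is $n^{n/2}$, so the computed values are $n^{1/(n-1)}$, which decrease to the mapping radius 1.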
On the other hand, if $\mu_n$ is the distribution of mass on E where each point $z_{jn}$
carries the mass 1/n, then

$$-\log |f_n(z)| = \int_E \log \frac{1}{|z - t|}\, d\mu_n(t). \tag{15.8.30}$$

As $n \to \infty$, the left member converges to $-\log |f(z)|$ and $\mu_n$ converges weakly to $\mu_0$,
the equilibrium distribution on E, so that

$$-\log |f(z)| = \int_E \log \frac{1}{|z - t|}\, d\mu_0(t) \tag{15.8.31}$$

which is the equilibrium potential $U(z; \mu_0, E)$. All three functions are harmonic in
the complement of E. $U(z; \mu_0, E)$ approaches V(E) on ∂E, excepting at most a set
of logarithmic capacity 0. The first member of (15.8.31) approaches $-\log d_0(E)$
almost everywhere on ∂E, the exceptional set again being of logarithmic capacity 0.
Hence

$$-\log d_0(E) = V(E) = -\log C(E) \tag{15.8.32}$$

or

$$d_0(E) = C(E), \tag{15.8.33}$$

so that the transfinite diameter also equals the logarithmic capacity of E as observed
by Szegő.

The coefficients $A_k$ in (15.8.28) can be expressed in terms of the moments of the
equilibrium distribution

$$M_k = \int_E t^k\, d\mu_0(t) \tag{15.8.34}$$

via the series for the logarithmic derivative

$$\frac{f'(z)}{f(z)} = \sum_{k=0}^{\infty} M_k z^{-k-1}. \tag{15.8.35}$$

The actual formulas for $A_k$ in terms of the $M_k$ are fairly complicated, see the
exercise below. The expressions coincide with the Newton-Waring formulas for
the kth symmetric function of n variables in terms of the corresponding power
sums.
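Although the closed formulas are complicated, the coefficients can be generated recursively. Writing $f(z) = z\,g(1/z)$ with $g(w) = \sum_{j \ge 0} c_j w^j$, $c_0 = 1$, $c_{j+1} = A_j$, the series (15.8.35) is equivalent to $w g'(w) = -g(w) \sum_{k \ge 1} M_k w^k$, that is, $j c_j = -\sum_{i=1}^{j} M_i c_{j-i}$, which has the shape of Newton's identities. The sketch below implements this recurrence; the function name is hypothetical. As a test case it assumes a standard example that is not taken from the text: for E = [−2, 2] the equilibrium distribution is the arcsine distribution with moments 1, 0, 2, 0, 6, 0, 20, and the exterior mapping function is $\tfrac{1}{2}(z + \sqrt{z^2 - 4}) = z - z^{-1} - z^{-3} - 2z^{-5} - \cdots$.

```python
from fractions import Fraction

def mapping_coefficients(moments, count):
    """Return [A_0, ..., A_{count-1}] from moments[k] = M_k, k = 0, ..., count.

    Uses j * c_j = -(M_1 c_{j-1} + ... + M_j c_0) with c_0 = 1 and A_j = c_{j+1}.
    """
    c = [Fraction(1)]
    for j in range(1, count + 1):
        s = sum(Fraction(moments[i]) * c[j - i] for i in range(1, j + 1))
        c.append(-s / j)
    return c[1:]

# Assumed test case: moments of the arcsine distribution on [-2, 2].
segment_moments = [1, 0, 2, 0, 6, 0, 20]
coefficients = mapping_coefficients(segment_moments, 6)
```

For the unit circle all moments beyond $M_0$ vanish and the recurrence returns zeros, in agreement with f(z) = z.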


EXERCISE 15.8 


1. Prove that I(μ; E, K) is finite when K(0) is.


2. Verify (15.8.19). 
3. Verify (15.8.29). 
4. Show that the exterior mapping function satisfies

$$-\log f(z) = \int_E \log \frac{1}{z - t}\, d\mu_0(t), \quad -\pi < \arg z < \pi,$$

where the logarithm has its principal value. Use this to verify (15.8.35). [Hint: The
harmonic conjugate of the left member of (15.8.30) is obtained by replacing the
kernel by its harmonic conjugate.]


5. Compute the coefficients $A_0$, $A_1$, $A_2$ in (15.8.28).
6. Show that

$$A_{k-1} = \sum \frac{(-1)^{p_1 + p_2 + \cdots + p_k}}{p_1!\, p_2! \cdots p_k!} \left(\frac{M_1}{1}\right)^{p_1} \left(\frac{M_2}{2}\right)^{p_2} \cdots \left(\frac{M_k}{k}\right)^{p_k},$$

where the summation extends over those non-negative integers $p_j$ such that
$p_1 + 2p_2 + \cdots + k p_k = k$.
7. The potentials of M. Riesz are based on the kernel

$$R_\alpha(x) = \frac{\Gamma[\tfrac{1}{2}(m - \alpha)]}{\pi^{m/2}\, 2^{\alpha}\, \Gamma(\tfrac{1}{2}\alpha)}\, \|x\|^{\alpha - m}, \quad 0 < \alpha < m.$$

If α, β, α + β are all in (0, m), the kernel has the elegant composition property

$$R_{\alpha+\beta}(x) = \int_{R^m} R_\alpha(s)\, R_\beta(x - s)\, ds.$$

Verify this for m = 3. It is enough to consider vectors of the form x = (t, 0, 0).
Introduce polar coordinates.


COLLATERAL READING 


For the whole chapter, see the survey article: 


HILLE, E., Topics in Classical Analysis. Lectures on Modern Mathematics, Vol. III
(ed. T. L. Saaty), Wiley, New York (1965).


Sections 3, 4, 5, and the extensive bibliography bear on the present chapter. The reader is 
referred to this survey for further details. For Section 15.4 see: 


HILLE, E., Some geometric extremal problems, J. Australian Math. Soc. 6 (1966)
122-8. 


For Section 15.8 see also: 


CARLESON, L., Selected Problems on Exceptional Sets, Van Nostrand, Princeton, 
N.J. (1967). [Bibliography with 1049 entries.] 

CHOQUET, G., Theory of capacities, Ann. Inst. Fourier, Grenoble 5 (1953-4) 131-295. 

HILLE, E., Analytic Function Theory, Vol. 2, Ginn, Boston (1962). Sections 16.3 to
16.5 and 17.3. 


INDEX 


Abelian, 295 
abstract 
holomorphic functions, 249-258 
polynomials, 264, 274 
Riemann-Stieltjes integrals, 230-236 
spaces, 51 
ACZEL, J., 422, 426, 429, 430, 435, 437, 439 
—Hosszu theorem, 430, 448 
uniqueness theorem, 427 
addition 
matrix, 19 
sequence, 86 
theorems, 376, 424, 435 
vector, 2, 8, 52 
additive 
function, 380, 422 
functional, 421 
group, 52, 386 
measure, 113 
set function, 468 
space, 385 
adverse, 279 
ALBERT, A. A., 411 
d’ALEMBERT, J., 434 
algebra 
Abelian, 295 
associative, 20 
Banach, 80-84, 276-304 
center of, 277. 
commutant of, 295 
commutative, 20, 276 
distributive, 20 
Hausdorff, 225 
non-associative, 20 
non-commutative, 20, 276 
quotient, (= residue class) 292 
sigma, 112 
simple, 290 
topological, 225 
almost everywhere, 113 
analytic 


continuation, 256, 262, 271 
Fréchet, 264—275 
Lorch, 275 
X-, 254 
angular semi-module, 386 
antiderivation, 372, 379 
antisymmetry, 162 
approximation theorems 
Bernstein, 101 
Fejér, 153, 157 
Landau, 106 
Lebesgue, 100, 158 
Weierstrass, 99 


approximations, successive, 194-199 
arithmetic means, 95, 428, 434, 440, 448, 467
ARZELA, G., 98 
compactness theorem, 98, 151 
ASCOLI, C., 98 
equi-continuity, 97 
automorphism, 79 
average, 437 
A-, 437 
moving, 148 
axes, 1 
axiom of choice, 163 


BACHMAN, G., 50, 85 

BAIRE, R. L., 160
category, 160

BANACH, S., 95, 159, 187, 311 
algebra, 80-84, 276-304 
closed graph theorem, 312 
fixed point theorem, 169-172 


Hahn-B. extension theorem, 318 


—Steinhaus theorem, 308 
spaces, 51-56 

BARTLE, R. G., 111 

basis, 3, 11 
orthogonal, 3, 11 


orthogonalization process, 3-7, 12, 69 




orthonormal, 3, 12 
BERNSTEIN, S. N., 100, 343 
approximation theorem, 101 
polynomials, 101, 343 
BESSEL, F. W., 71 
functions, 71 
inequality, 71, 150 
BIRKHOFF, G., 248 
BIRKHOFF, G. D., 159
—Kellogg theorem, 159 
BLASCHKE, W., 394, 411 
function of support, 394 
block, 115 
BOCHNER, S. 
integral, 236-248 
BOHNENBLUST, H. F. 
-Sobczyk theorem, 319
BOHR, H.
-almost periodic functions, 353
BOLZANO, B. 
—Weierstrass theorem, 98, 160 
BONSALL, F. F., 187 
BOREL, E., 468 
Heine-B. theorem, 160 
—property, 160 
sets, 146, 468 
boundedness, 305-310 
principle of uniform, 212—217 
BOUNYAKOVSKY, V. J., 139 
—Schwarz inequality, 139 
BOURBAKI, N., 15, 409
BOURLET, C., 435 
operators, 424 
derivation, 424 
multiplication, 426 
substitution, 426 
box product, 395 
BRILLOUIN, L., 416, 435 
BROUWER, L. E. J., 159
fixed point theorem, 159 


CACCIOPOLLI, R., 159 
calcul de limites, 201 
capacity 
K-, 470 
logarithmic, 473 
Newtonian, 469 


CARATHEODORY, C., 115, 158, 366 


functional inequality, 366 
measure theory, 115 
CARLEMAN, T., 341, 381, 451 
inequality, 451 
CARLESON, L., 468, 474 


CARLSON, F., 264 
CARTWRIGHT, M., 411 
category, Baire, 160 
CAUCHY, A. L., 7, 201, 204, 401, 435, 441
calcul de limites, 201 
estimates, 254 
first theorem, 95 
functional equations, 418-428 
induction argument, 401, 441 
integral, 254 
majorant, 201 
product, 94 
product theorem, 201 
sequences, 54 
equivalence classes of, 160 
CAYLEY, A., 34 
—Hamilton theorem, 34 
operator, 335, 359 
ČEBYŠEV, P. L., 459
constants, 459-464 
polynomials, 459, 464 
center of 
algebra, 277 
quadric, 42 
CESÀRO, E., 95
(C, 1)-summability, 95, 185 
characteristic equation of matrix, 29, 34 
characteristic function of sets, 134 
characteristic root—see characteristic value 
characteristic space, 37 
characteristic value of 
integral equation, 179, 184 
matrix, 29-33, 36-39, 44-50, 185, 342 
operator, 65, 80, 179, 328, 337, 341
characteristic vector, 37, 44, 45, 80, 328, 
337, 341 
CHOQUET, G., 468, 474 
closed graph theorem, 312 
closure, 310-316 
relation, 151, 154, 158 
cofactor, 22 
commutant, 295 
commutator, 296 
compactness, 98, 160 
conditional, 160 
sequential, 161 
weak, 222 
complement, orthogonal, 76, 332 
completeness, 
of $l_p$, 88, 161
of C[a, b], 97
of $L_p$[a, b], 140-143


weak, 221
cone, 13
positive, 164
conjugation, 78
consistency theorem, 450
continuity
modulus of, 97, 145, 149
strong, weak- of vector functions, 226
strong, uniform, weak- of operator functions, 227
uniform- scalar functions, 97
continuous functions, 96-107
continuous spectrum, 328-387
contraction
fixed point theorems, 169-173
mappings, 165-169
contractive mappings, 173-175
convergence, 54
absolute, 230
dominated (integrals), 136, 240
monotone (integrals), 126, 238
sense of metric, 13, 26, 54, 159
strong, 218
unconditional, 230
uniform, 97
weak, 221
convex functions, 400-406
convexity, 12
convex solid, 12, 396-399
convolution in
$L_1(-\infty, \infty)$, 138
$L_2(-\pi, \pi)$, 156, 178
coordinates, Cartesian, 1
covering, 160
CRAMÉR, H., 264
“cryptoanalysis,” 412-417
cylinder, 124


DARBOUX, G., 418, 435
DARÓCZY, Z., 419, 435
determinant, 3
nullity, 17
of matrix, 15
rank, 17
determinative inequalities, 363, 364-367
difference set, 218, 296
differentiability,
strong, weak, 227
uniform of holomorphic functions, 250
differential, Fréchet, 266
differential inequality, 365
dimension, 8, 11, 17, 53
DIRICHLET, P. G. L., 151
kernel, 151
operator, 360
product, 94
disphere, 192
dissolvent, 280-290
equations, 287, 417
expansion at infinity, 288
operational calculus, 303-304
distance, 9, 39-40, 159
domain
of transformation, 14, 305
= open connected set, 264
dominated convergence theorem, 136, 240
DUNFORD, N., 229, 248, 249, 304, 330, 361, 430
addition formulas, 424, 435
bounded variation, 229
integral, 248


EDELSTEIN, M.,
fixed point theorem, 173
EGGLESTON, E. G., 408, 411
Egyptian Institute, 70
eigen—see characteristic
element
divisor of zero, 277
idempotent, 277
inverse, invertible, 277
maximal, 163
member, 13
negative, 52
neutral, 165, 279
nilpotent, 277
positive, 162
quasi-nilpotent, 277
regular, 81, 277
reversible, 279
singular, 81, 277
unit, 81
zero, 52
endomorphism, 57
EPSTEIN, B., 158, 330, 361
equality, 52
equations
characteristic, 29, 34
infinitely many unknowns, 184-186
linear, 16
minimal, 34-35
ordinary differential, 204-211, 373-376, 415-417
equicontinuity, 97
equilibrium


distribution, potential, 469 
equilocally bounded, 272 
equivalence, 134 

class, 136, 160 

modulo maximal ideal, 292 

extension 

Bohnenblust-Sobczyk theorem, 319 

Hahn-Banach theorem 

principal, 297 
exterior conformal mapping radius, 472 


FANTAPPIE, L., 249 
FATOU, P., 
lemma, 136, 239 
FEJÉR, L., 151
kernel, 151 
theorem, 157 
FEKETE, M., 456, 460, 471 
exterior conformal 
mapping radius, 472 
polynomial, 471 
transfinite diameter, 456, 458 
FERMI, E., 
Thomas-F. equation, 416 
field 
rational, 160 
extension of, 420 
scalar, 52 
sigma, 112
de FINETTI, B., 437 
FISCHER, E., 156 
Riesz—F. theorem, 156 
fixed point theorems 
Banach, 169-188 
Brouwer, 159 
Edelstein, 173-175 
Volterra, 176-177 
FOMIN, S. V., 84
form 
Hermitian, 42-49 
polar, 43 
quadratic, 42, 49 
FOURIER, J., 69, 184
coefficients, 69, 149 
series, 69, 149-158 
transformation, 360 
FRECHET, M., 55, 249 
analytic, 296-304 
differentiable, 
differentials, 265 
functionals on L2, 146 
metric, 87 
FREDHOLM, I., 178-184 


integral equation, 178-184 

FROBENIUS, G., 25 
matrix norm, 25 

FROSTMAN, O., 468 

FUBINI, G., 135
theorem, 135 

functional equations, 412-436 
Aczél—Ng intern type, 426-436 
addition theorems, 424, 435 
antiderivations, 372, 379 
Bourlet type, 424, 426, 435 
Cauchy and generalizations, 418—423, 435 
“‘cryptoanalysis’’, 412-417 
derivation, 424 
dissolvent, 280, 287-290 
mean value type, 439-449 
multiplication theorems, 424 
Picard, 169 
resolvent, 36, 281-290, 413 
Schröder, 426, 436

functional inequalities, 362-379, 380-411 
absurd, 363 
Carathéodory, 366 
classification, 362-364 
determinative, 363, 364—367 
Gronwall, 370 
multiplicative, 376—379 
Nagumo, 366, 373 
restrictive, 363 
trivial, 362 

functionals, 7 
additive on matrices, 421 
bilinear, 7 
bounded, 8 
inner product, 7, 12 
linear, 7, 12, 60-63, 76-77, 317-322 
linear multiplicative, 63, 103, 149 
multiplicative on matrices, 422 
on BV[a, b], 109
on C[a, b], 103
on Ls 92 
on L (5), 146 

functions 
abstract holomorphic, 249-258 
additive, 419 

non-homogeneous, 419 

almost periodic, 353 
Bessel, 71 
$8—holomorphic, 282 
bounded variation, 107-111 
Cauchy holomorphic, 249 
characteristic, 134 


Fourier series of, 155 
continuous, 96-107 
continuously differentiable, 190 
Fréchet analytic, 264-275 
gauge (= semi-norms), 40, 317, 383, 396 
Green’s, 316, 471 
indicator, 391 
Lorch analytic, 275 
meromorphic, 181 
piecewise linear, 98 
simple, 125 
spectral, 342, 350-353
subadditive, 40, 380-400 
subharmonic, 255 
suboperative, 385 

support, 395
X-holomorphic, 252 


GATEAUX, R., 249, 264 
(G)-differentiable, 265, 275 
gauge function (= semi-norm), 40, 317, 
383, 396 
circular, 319 
Gauss, C. F. 
arithmetico-geometric means, 451 
GELFAND, I. M., 229, 249, 290 
bounded variation, 229 
integral, 248 
operational calculus, 296-304 
representation theorem, 290-296 
spectral mapping theorem, 302 
spectral radius, 52, 277 
uniqueness theorem, 300 
generator, infinitesimal, 167 
GOURSAT, E., 195
GRAM, J. P., 4 
-Schmidt orthogonalization process, 
3-7, 12, 69 
Gramian, 5, 13 
graph, 306 
GRAVES, L. M., 248, 264 
integral, 248 
Green’s function 
of linear differential equations, 316 
of potential theory, 471 
GRONWALL, T. H., 370, 373, 379 
inequality = lemma, 370 
group 
additive, 52, 386 
of regular elements, 81, 277 
of reversible elements, 279 
growth indicator, 391 




HAHN, H., 322
—Banach theorem, 318 
regular, 322 
HALMOS, P., 50, 361
HAMEL, G., 419 
additive non-measurable, 419 
basis, 419 
HAMILTON, Sir W. R., 34 
—Cayley theorem, 34 
HARDY, G. H., 411
harmonic mean, 447 
HARTMAN, S., 158 
HAUSDORFF, F., 218, 335, 361
algebra, 225 
positive polynomials, 343 
space, 218
linear, 220 
—Toeplitz theorem, 335-341 
HELLINGER, E., 342 
HELLY, E., 322 
HERMITE, C., 24 
Hermitian form, 42-50 
matrix, 24 
operators, 233 
HIGHBERG, I. E., 264 
HILBERT, D., 183, 249
space, 69, 331, 342
HILL, G. W., 184
HILLE, E., 111, 167, 188, 211, 248, 275, 304,
330, 371, 385, 410, 411, 424, 435, 474 
addition formulas, 424, 435 
functional equations, 169, 188 
geometric extremal problems, 451-456, 
474 
Landau’s inequality, 188 
Taylor’s theorem (semi-groups), 167 
Thomas—Fermi equation, 416, 435 
HOFFMAN, K., 84 
HÖLDER, O., 88, 400
inequality, 88-90, 138 
—Jensen inequality, 400 
means, 450 
Hosszu, H., 422, 426, 430, 435 
Aczél-theorem, 430, 448 
Hsu, I.-C., 112, 162, 212 
partial ordering, 162 
hull, convex, 13 
hypercube, 13 
hyperplane, 17 
of support, 396 


ideals, 63, 290-291 
maximal, 291 




proper, trivial, unit, 290 
idempotent, 24, 82, 277, 285, 288, 294 
image, 56, 165 

pre-, 56 
implicit function theorem, 189-194 
independence, linear, 3, 10, 53 
indicator

inequality, 392 

of radial growth, 391 
induction 

principle, 86 

retrogressive, 401, 432 
inequality 

Banach, 368 

Carathéodory, 366 

Carleman, 451

Cauchy, 3, 7 

differential integral, 356 

functional, 362-411 

Gronwall, 369 

Holder, 88, 138 

indicator, 392 

Kurepa, 169 

Landau—Kallman-Rota, 167 

Minkowski, 90, 139 

Nagumo, 373 

triangle, 12, 54 

Volterra, 369 
infimum, 163 
information theory, 429, 447, 449 
injection, injective, 15, 427 
integral 

Bochner, 236-248 

Lebesgue, 123-136 

Radon-Stieltjes, 103, 469 

Riemann, 112 

Riemann-Stieltjes, 103

abstract, 230-236 
integral equations 

Fredholm, 181-184 

Volterra, 179-181 
integration of Fourier series, 155 
involution, 24, 137, 334 
IONESCU TULCEA, C. T., 383, 385, 411, 423,

435 
functional equation, 423 
inequality, 383-385 

suboperative functions, 385 
isometry, 69, 76, 155 
isomorphism, 19, 69, 300 


JACOBSON, N. 
circle product, 279 


JENSEN, J. L. W. V., 400, 411 
convex functions, 400-408 
equation, 428, 434 
inequality, 400 

JORDAN, C. 

curve theorem, 253 
operator, 296 


KALLMAN, R. R. 

Landau-Rota theorem, 167, 183, 361 
K-capacity, 470 
KELLOGG, O. D. 

Birkhoff-fixed point theorem, 159 
KEMPERMAN, J. H. B., 419, 425 
kernel 

Dirichlet, 150 

Fejér, 151 

nullspace, 14, 57, 306 

of integral equation, 178-184

resolvent, 180 

symmetric, 183 
KOCH, H. von, 184
KOLMOGOROV, A. N., 84, 317, 437

matrix, 329 

postulates, 437 
KRONECKER, L., 85 


delta, 6 
Kuczma, M., 435 
KUNZE, R., 84 
KUREPA, S. 


inequality, 169 


LANDAU, E., 167 
approximation theorem, 106 
inequality, 167 
—Kallman—Rota theorem, 167, 188, 361 
lattice, 163 
Laurent expansion, 255, 417 
law of 
cancellation, 53 
cosines, 5 
exponents, 19, 166 
LEBESGUE, H., 100 
approximation theorems, 100, 158 
integral, 125 
integration, 123-136 
dominated convergence theorem, 136, 
240 
measurable, 119-123 
measure, 112-119 
modulus of continuity, 145, 149 
monotone convergence theorem, 126, 144 
Riemann-theorem, 150 


spaces, 112-158, (136-149) 
summability, 158 

LINDELOF, E., 204, 263 
majorants, 204, 211 
Phragmén-growth indicator, 391 
Vitali-theorem, 259 


linear transformations, 12-18, 56-60, 305-330
adjoint, 44, 77, 322-325, 333
bounded, 57, 165, 305-310 
closed, 307, 310-316 
extension of, 307 
Hermitian, 44, 77, 333, 341-361 
inverse, 14, 58, 65, 165, 322-325 
normal, 78, 334, 338-341 
unitary, 18, 78, 334 


X into 3: X equals BV, 110; C, 104; l 


93-94 
LIOUVILLE, J. 
theorem, 257 
LIPSCHITZ, R. 
condition, 165, 192 
LITTLEWOOD, J. E., 411 
LORCH, E. R., 211, 249
-analytic, 275 
reflexive space, 221 
LosonczI, L., 419, 435 


majorants, 199-204 
Cauchy, 204, 211 
Lindelöf, 204, 211
series, 210 
mapping, 14 
bounded, 57, 165, 305-310 
cell-intern, 433 
contraction, 165-169 
contractive, 173-175 
intern, 427, 430 
into, 14 
injective, (= 1-1), 15, 58, 165 
inverse, 14, 58, 165, 322-325 
onto (= surjective), 15, 58, 165 
MARTIN, R. S., 264
mass, 468 
matrix, 16-27 
characteristic equation, 29, 35 
diagonal, 28, 47 
divisor of zero, 25 
Hermitian, 24 
idempotent, 24 
infinite, 184-186 
inverse, 21 
Jacobian, 194 




Kolmogorov, 329 
minimal equation, 35, 41 
nilpotent, 25 
norms of, 25, 41 
nullity, 17 
positive, 198 
rank, 17 
regular, singular, 17 
similar, 23 
spectrum, 29 
symmetric, 24 
transpose, 24 
unit, 21 
unitary, 18 
maximum principle, 256, 272 
mean square convergence, 158 
mean value(s), 437-474 
A-average, 437-451 
arithmetic, 95, 152, 428 
arithmetico—geometric, 451 
functional equations of, 439-449 
geometric, 439, 447 
harmonic, 439, 447 
Hölder, 450
property, 255 
pth power, 95, 140, 447 
measurability 
A-, u-, 113-115, 236, 468 
Lebesgue, 119-123 
of functions, 119-123 
radial, 390 
of sets, 112-119 
measure 
Carathéodory, 115 
inner, 124, 135 
Lebesgue, 112-119 
outer, 114 
space, 113 
unit, 114 
METZLER, R. C., 420, 423, 425 
MICHAL, A. D., 264 
MIKUSINSKI, J., 158 
MILLER, J. B., 169, 188, 379 
antiderivations, 372 
uniqueness theorem, 433 
MINKOWSKI, H., 40 
convexity, 380, 395, 411 
distance, 40 
gauge function, 40, 317, 319 
geometry of numbers, 40 
inequality, 90, 139 
/ -norm, 40, 88 




semi-norm, 40 

support function, 395 
minors, 7 

principal, 8 
MITRINOVIĆ, D. S., 379
MITTAG-LEFFLER, G.

partial fraction series, 181 
MÖBIUS, A. F., 187

constants, 187 

inversion formula, 187 
modulus of continuity 

for C[a, b], 97

for L (a, b), 145, 149 
Monotone convergence theorem, 126, 244 
Morera theorem, 258 
multiplication 

element-wise, 19, 64-65, 80, 86 

scalar, 2, 8, 19, 52, 56, 63 

theorem, 376, 424 

operator, 426 
multiplicative inequalities, 376-379 


NAGUMO, M.
inequality, 366, 373 
postulates, 285 
resolvent expansion, 285, 326 
uniqueness theorem, 366, 373 
NAIMARK, M. A., 304 
NARICI, L., 50, 84
neighborhood 
epsilon, 54 
Hausdorff, 218 
strong, weak, 224 
NEUMANN, C., 177 
series, 177 
NEUMANN, J. von, 341, 361
NEWMAN, M. H. A., 84
NEWTON, Sir ISAAC 
potential, 458 
—Waring formula, 473 
NG, C. T., 436, 437
uniqueness theorem, 430-433 
nilpotency, 25 
norm, 2, 9, 25, 40, 41, 55, 57, 306 
differentiability of, 334 
of l_p, 88
of L_p, 136, 137
sup-, 87, 97 
nullity, 17 
null space, 57, 306 
operational calculus, 296-304 
in Hilbert space, 341-353 
operator 


adjoint, 44, 77, 322-325, 333 
antiderivation, 372 

bounded, 57, 165, 305-310 
Bourlet, 424 

Cayley, 335, 359 

closed, 307, 310-316 
commutator, 296 

conjugation, 24 

contraction, 165-169 
convolution, 138, 156 
derivation, 424 

Dirichlet, 360 

Hermitian, 44, 77, 333, 341-361 
identity, 65 

inverse, 14, 58, 65, 165, 322-325 
involution, 24, 137, 334 
Jordan, 296 

multiplication, 426 

nilpotent, 82, 328 

normal, 78, 334, 338-341 

order preserving, 164, 199 
positive, 199, 345-348 
projection, 24, 79, 332, 343, 350-354 
quasi-nilpotent, 82, 326 
self-adjoint, 77 

shift, 167 

substitution, 426 

unitary, 78 


order (ing) 


partial, 165-165 

preserving, 164, 199 

total, 162 

under inclusion, 162 
ordinate sets, 123 
oscillation reducing, 439 


OSTROWSKI, A., 419, 436
parallelogram law, 12, 68, 332 
extended, 12, 69, 453 


PARSEVAL, M. A.
identity, 72, 151 
parts 
of Hermitian operator, 78 
of real-valued function, 121 
PEANO, G., 85 
postulates, 85 
“pecking order”, 329
PERLIS, S. 
circle product, 279 
“perp”, 74 
perspectivity, 430 
PETTIS, B. J., 248 


PHILLIPS, R. S., 248, 275, 304, 330, 385, 410


PHRAGMÉN, E., 263
growth indicator, 391, 411 
PICARD, E., 169, 194 
transform, 169, 172 
successive approximations, 194-199 
two-point theorem, 433, 436 
PLÜCKER, J., 396
POINCARÉ, H., 184
point, 1, 9, 51 
coordinates, 395 
spectrum, 183, 328 
POISSON, S. D.
transform, 172 
polar 
form, 43, 77, 333 
plane, 43 
solids, 396 
pole of 
meromorphic function, 183 
polar plane, 43 
polar convex solids, 396 
X-holomorphic function, 255 
PÓLYA, G., 394, 411, 456, 468
indicator, 391 
transfinite diameter, 456, 471 
polynomial 
abstract, 264, 274 
Bernstein, 101 
Čebyšev, 459, 464
Fekete, 471 
positive, 344 
PORTER, M. B., 258
positivity, 198-199, 343-348 
postulates 
A-averages, 437-439 
addition, 52 
distance, 40, 54 
equality, 52 
multiplication, 80 
norm, 39, 55 
Peano, 85 
scalar multiplication, 52 
potential 
K, 470 
logarithmic, 456, 465, 472 
Newtonian, 457, 465, 469 
M. Riesz, 474 
theories, 467-474 
power 
abstract, 264 
means, 140 




set, 113 
principle of 

absolute integrability, 240 

maximum, 256-257, 272 

repeated averages, 438 

uniform boundedness, 212-217 
product 

box, 395 

Cartesian, 56 

circle, 279 

cross, 8 

dot, 4 

inner, 4, 9, 67 

scalar, 8, 52 

space, 56 

vector, 8 
projection, 24, 66, 79, 326, 332 
property 

absolute integrability, 129 

mean value, 255 
PYTHAGORAS, 68 

theorem, 69, 332 


q-difference equation, 376 
quadratic form, 42, 77, 333 
quadric surface, 42 
quasi-inverse, 279 
quasi-nilpotent, 82 
quotient algebra, 292 


RADEMACHER, H., 411 
radius, spectral, 83, 278 
RADON, J., 469
—Stieltjes integral, 146, 469 
range, 14, 58, 305 
numerical, 335-341 
rank, 17 
residual spectrum, 328, 337 
residue class, 292 
resolution 
of the identity, 34, 50, 333, 343 
spectral, 326 




resolvent, 28-36, 65-66, 83-84, 281-290
—equations, 36, 281-290, 413
—kernel, 180-184
spectral representation of, 32, 50, 342, 359
reverse, revertible, 279 
RICCATI, J. F., Count
equations, 289, 290 
RICKART, C. E., 304 
Riemann-Stieltjes integral, 103 
abstract, 230-236 




RIESZ, F., 55, 249, 322, 342
—Fischer theorem, 156 
functionals on C[a, b], 104 
on L (a, b), 146
positive operator theorem, 345-346 
splitting theorem, 349-350 
RIESZ, M., 464
potential, 474 
RINIERSE, P. J., 416, 436 
ROBIN, G., 469
constant, 469 
ROSENBAUM, R. 
subadditive functions, 385, 411 
ROTA, G.-C.
Landau-Kallman-theorem, 167, 168 
RUNGE, C., 258 


scalar multiplication of 
linear transformations, 19, 63 
matrices, 19 
sequences, 86 
vectors, 2, 8, 52 
SCHAEFFER, H. H., 187, 330 
SCHMIDT, E., 4 
characteristic values, 183 
Gram-orthogonalization process, 3-7, 12, 69
SCHRÖDER, E.
functional equation, 426, 436, 464 
SCHWARZ, H. A., 139 
Bounyakovsky—inequality, 139
lemma, 257 
SCHWARTZ, J., 248, 304, 330, 361 
SEBASTIAO E SILVA, J., 264 
semi-group 
of operators, 165, 167, 423 
infinitesimal generator of, 167 
neutral element of, 165 
Poisson, 172 
semi-module, 386-400 
angular, 386 
sequence, 54 
Cauchy, 5 
diagonal, 22 
spaces, 85-96
subadditive, 278, 385 
series 
Fourier 
abstract, 70-73 
trigonometric, 149-158 
gap = lacunary, 157 
Taylor, 255 


set 
A-measurable, 113 
closed, open, 54 
cocountable, 113 
compact, 160 
conditionally, 
sequentially, 161 
convex, 13 
dense, nowhere dense, 54 
difference, 218, 296 
dissolvent, 279 
—function, 114, 253, 456, 459, 468 
infimum, 163 
lim, inf, sup, 119 
measurable, 113 
ordinate, 123 
power, 113 
product, 296 
resolvent, 65, 83, 96, 109, 277, 325 
spectral, 326 
sub-, 13 
sum-, 387 
supremum, 163 
void, 112 
simplex, 452 
simplicial problem, 452 
singular points, 375 
rate of growth at, 374, 415-417 
regular, irregular, 376 
sinusoid, 395 
SOBCZYK, A. 
Bohnenblust—theorem, 319 
space 
abstract, 51 
adjoint = dual, 60, 218 
Banach, 51-56
characteristic = eigen, 37
complex Euclidean, 1-50
conjugate, 88, 137, 221 
Euclidean three space, 1-13 
Hausdorff, 218 
Hilbert, 69-80 
inner-product, 67-80, 331-361 
linear 
operator, 63-67, 305-330 
vector, 51-84 
measurable, 113 
measure, 113
metric, 39-41, 54-55, 159-188 
null, 14 
pre-Hilbert, 67 
reflexive = regular, 221 


root, 36 
second dual, 221 


ΜΔ, ἢ] 


X equals
BV, 103, 107-111; C[a, b], 96-107;
C*[a, b], 105; C®, 105; C^n, 8-13;
Lebesgue, 112-156 (136-149);
Mt", 18-28; WtC|a, b], 104; R^n, 9;
sequence, 85-96 
spectral 
mapping theorem, 296, 302, 354-361 
properties, 328 
theorem, 354-361 
spectrum, 277, 279 
continuous, 328 
point, 183, 328 
residual, 328 
STOLL, R. R., 50 
STONE, M. H., 341, 361 
structure 
algebraic, 51 
metric, 51, 189 
topological, 217 
subadditive functions, 380-399 
subadditivity, 12, 40, 380-400 
suboperative functions, 385 
successive approximations, 194-204 
successor, 85 
summability, 449-451 
Cesaro, 95, 185 
Fejér, 152, 157 
Lebesgue, 158 
supremum, 163 
essential, 137 
surjection, surjective = onto, 15 


system orthogonal, orthonormal, 12, 69-73 


SZEGÖ, G., 456, 468
logarithmic capacity, 473 
transfinite diameters, 456, 457, 471 
SZEKERES, G., 172 


tangential coordinates, 395 
TARGONSKI, G. I., 424, 426, 435 
Bourlet operators, 424, 426 
TAYLOR, A. E., 84, 249, 264
Taylor expansion, 167, 255 
theorem 
Aczél, 427 
—Hosszu, 430, 436 
—Ng, 430-433 
addition, 376, 424, 435 
Arzela, 98, 151 




Banach fixed point, 169-172 
—Steinhaus, 308
Bernstein, 101
Birkhoff—Kellogg, 159 
Bohnenblust—Sobczyk, 319
Bolzano—Weierstrass, 98, 160
Brouwer, 159 
Carlson—Cramér—Wigert, 264 
Cayley—Hamilton, 34 
closed graph, 312 
consistency, 450 
dominated convergence, 136, 240 
Fejér, 157 
Fubini, 135, 469 
Gelfand 
representation, 290-296 
uniqueness, 300 
Hahn-Banach, 318 
Heine—Borel, 160 
identity, 255 
implicit function, 189-194 
Jordan, 253 
Landau—Kallman-Rota, 167, 183, 361 
Lebesgue approximation -s, 100, 158 
Liouville, 257 
monodromy, 263 
monotone convergence, 238 
Morera, 258 
Nagumo -s, 285, 326, 366, 378 
Ng, 430-433 
Phragmén—Lindelöf, 263
positive operator, 345-346 
Pythagoras, 69, 332
Runge, 258 
spectral, 354-361 
mapping, 296, 302, 354-361 
splitting, 349-350 
Toeplitz—Hausdorff, 335-341
Vitali—Lindelöf, 258-264, 272-274
uniform boundedness, 217 
Weierstrass approximation, 99 


THOMAS, L. M.
—Fermi equation, 416
TOEPLITZ, O., 335, 361
—Hausdorff theorem, 335-341


topologies, 217-226 


Hausdorff, 218 
strong, 217 
weak, 218, 220 


torus, 172 
transformation (see also linear transformation, mapping and operator)




bounded linear, 56-60, 63-66, 305-310
one-parameter semi-group, 167
bounded non-linear, 165
semi-group, 165
closed, 307, 310-316
Fourier, 360
Picard, 169
Poisson, 172
translation-invariant, 448
truncation, 123


uniform boundedness principle, 212-217
uniqueness theorems for
differential equations, 365-366
fixed points, 169-179
functional equations, 426-436
unit
element, 21, 86, 277
matrix, 21
of length, 1
operator = identity, 65
sphere, 13
vector, 2


valuations, non-Euclidean, 409-410, 461
vector, 1, 9, 51
addition, 2, 8, 52
negative, positive, 1
negative of, 52
positive, 164
space, 52
VINCZE, E., 426, 436
VITALI, G., 258
theorem, 258-264, 272-274
VOLTERRA, V., 176
fixed point theorem, 176-177
inequality, 369
integral equations, 179-181


WALTER, W., 379
WARING, E.
Newton-formula, 473
WEDDERBURN, J. H. MACLAGAN, 25
matrix norm, 25
WEIERSTRASS, K., 99, 179
algebraic addition theorems, 424
approximation theorem, 99
Bolzano-theorem, 98
WEYL, H., 342
WIENER, N., 249
WIGERT, S., 264
Won, E. T., 50


YOOD, B., 15
YOSIDA, K., 167, 187


ZERMELO, E.
axiom of choice, 163
ZORN, M., 163, 264
lemma, maximal principle, 163
(F)-differentiability, 269




THE AUTHOR 


Einar Hille, Professor Emeritus at Yale University, is cur- 
rently Visiting Professor at the University of New Mex- 
ico. He was educated in Sweden, where he received the 
Ph.D. degree from the University of Stockholm. Before 
accepting an appointment at Yale, he served on the fac- 
ulties of the University of Stockholm and Harvard and 
Princeton Universities. Dr. Hille has been a visiting pro- 
fessor at Stanford University, the University of Chicago, 
the University of California, Irvine, the University of 
Oregon, and at the Universities of Nancy, Mainz, Stock- 
holm, Uppsala, Tata Institute, and Australian National 
University. He was also a Fulbright lecturer at the Sor- 
bonne. Dr. Hille is the author of a number of mathemat- 
ics texts, including Lectures on Ordinary Differential 
Equations (Addison-Wesley, 1969). He is past president 
of the American Academy of Arts and Sciences, the 
National Academy of Science, and the Royal Swedish 
Academy of Science. His field of specialization is mathe- 
matical analysis. 


Printed in U.S.A. 


OTHER BOOKS OF INTEREST 


LECTURES ON ORDINARY DIFFERENTIAL EQUATIONS 
By EINAR HILLE, Yale University 723 pp, 20 illus (1969) 


Designed for graduate-level courses, this book stresses the theory of ordinary 
differential equations and places special emphasis on behavior in the complex 
domain and the extension of the classical results to abstract valued solutions. 
“... the author has included a great deal of material that is not given in other
recent treatments of the subject, such as the theory of equations in a Banach 
algebra, and an extensive treatment of equations in the complex domain.” 
Mathematical Reviews


MEASURE AND INTEGRATION, Second Edition 
By M. E. MUNROE, University of New Hampshire 290 pp, 12 illus (1971) 


This text presents measure theory from the abstract or postulational point of 
view, centering on “Caratheodory” measure theory. In light of recent CUPM 
recommendations, a new chapter on functional analysis has been added as 
well as a thorough discussion of weak and weak* convergence in the standard
function spaces. The text is suitable for courses in measure theory at the 
graduate level. 


ANALYSIS II (REAL ANALYSIS)
By SERGE LANG, Columbia University 476 pp (1969) 


This graduate text on basic real analysis, being retitled Real Analysis in its
second printing, contains point-set topology, measure theory, two basic spe- 
cial theorems of functional analysis, and applications of integration theory to 
manifolds. 


INTRODUCTION TO PARTIAL DIFFERENTIAL EQUATIONS: 

FROM FOURIER SERIES TO BOUNDARY VALUE PROBLEMS 

By ARNE BROMAN, Chalmers University of Technology, Göteborg, Sweden
179 pp, 33 illus (1970) 


This text, based on a course that has been given for over 25 years at Chalmers 
University of Technology, is designed to give the student a basic knowledge 
of Fourier analysis and its applications. Courses in calculus, linear algebra, 
ordinary differential equations, and complex analysis are prerequisite. 
Exercises are given for every section of each chapter in the book—both 
routine exercises and more difficult ones to challenge the above-average 
student.


ADDISON-WESLEY PUBLISHING COMPANY 
Reading, Massachusetts
Menlo Park, California · London · Don Mills, Ontario


